Abstract

Two goodness-of-fit tests for copulas are investigated. The first deals with elliptical copulas
and the second with independent copulas. Both tests result from an extension of the projection
pursuit methodology that we introduce in the present article. This method enables us to determine
the axis system on which these copulas lie, as well as the exact value of these copulas in the
basis formed by those axes, irrespective of their value in their canonical basis. Simulations are
also presented, as well as an application to real datasets.

Outline of the article

The need to describe the dependency between two or more random variables triggered the
concept of copulas. Let us consider a joint cumulative distribution function (cdf) $F$ on
$\mathbb{R}^d$ and its marginal cdfs $F_1, F_2, \ldots, F_d$; then a copula $C$ is a function such that

$$F = C(F_1, F_2, \ldots, F_d).$$



Sklar (1959) was the first to establish the bases of this new theory. Several parametric
families of copulas have since been defined, namely elliptical, Archimedean, periodic copulas,
etc. - see Joe (1997) and Nelsen (2006), as well as Appendix A, for an overview of these families.



Finding criteria to determine the best copula for a given problem can only be achieved through
a goodness-of-fit (GOF) approach.



Several GOF copula approaches have so far been proposed in the literature, e.g. Carriere (1994),
Genest and Rémillard (2004), Fermanian (2005), Genest, Quessy and Rémillard (2006), Michiels and
De Schepper (2008), Genest, Favre, Béliveau and Jacques (2009), Mesfioui, Quessy and Toupin (2009),
Genest, Rémillard and Beaudoin (2009), Berg (2009), Bücher and Dette (2010), among others. However,
the field is still at an embryonic stage, which explains the current shortage of recommendations.
For univariate distributions, the GOF assessment can be performed using, for instance, the
well-known Kolmogorov test. In the multivariate field, there are fewer alternatives. A simple way
to build GOF approaches for multivariate random variables is to consider multi-dimensional
chi-square approaches, as in Broniatowski (2006) for example. However, these approaches present
feasibility issues for high-dimensional problems due to the curse of dimensionality. In order to
solve this, we will now introduce the theory of projection pursuit.

The objective of projection pursuit is to generate one or several projections providing as much 
information as possible about the structure of the dataset regardless of its size. 
Once a structure has been isolated, the corresponding data are transformed through a
Gaussianization. Through a recursive approach, this process is iterated to find another structure
in the remaining data, until no further structure can be evidenced in the data left at the end.



Friedman (1984) and Huber (1985) count among the first authors who introduced this type of
approach for evidencing structures. They each describe, with many examples, how to evidence
such a structure and consequently how to estimate the density of such data, each through a
different methodology. Their work is based on maximizing the Kullback-Leibler divergence.
In the present article, we introduce a new projection pursuit methodology based on the
minimisation of any $\phi$-divergence greater than the $L^1$ distance ($\phi$-PP). As we will
develop later on, this way of implementing the methodology encompasses all previous methods. The
algorithm also has the advantage of being more robust and faster from a numerical standpoint.
Finally, this process not only allows us to carry out GOF tests for elliptical and independent
copulas but also to determine the axis system upon which these copulas are based. It will also
enable us to derive the exact expression of these copulas in the basis constituted by these axes.

This paper is organised as follows: section 1 contains preliminary definitions and properties.
In section 2, we present in detail the $\phi$-projection pursuit algorithm. In section 3, we present
our first results. In section 4, we introduce our tests. In section 5, we provide two simulations
pertaining to the two major situations described herein and we study a real case.



1. Basic theory 

1.1. An introduction to copulas 

In this section, we will introduce the concept of copula. We will also define the family of
elliptical copulas through a brief reminder of elliptical distributions - see Appendix A for an
overview of other families.



1.1.1. Sklar's theorem

First, let us define a copula in $\mathbb{R}^d$.

Definition 1.1. A d-dimensional copula is a joint cumulative distribution function $C$ defined on
$[0,1]^d$, with uniform margins.

Moreover, the following theorem explains to what extent a copula describes the dependency
between two or more random variables.

Theorem 1.1 (Sklar (1959)). Let $F$ be a joint multivariate distribution with margins
$F_1, \ldots, F_d$; then there exists a copula $C$ such that

$$F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d)). \tag{1.1}$$

Moreover, if the marginal cumulative distributions are continuous, then the copula is unique.
Otherwise, the copula is unique on the range of values of the marginal cumulative distributions.

Remark 1.1. First, for any copula $C$ and any $u_i$ in $[0,1]$, $1 \le i \le d$, we have

$$W(u_1, \ldots, u_d) = \max\Big\{1 - d + \sum_{i=1}^d u_i,\ 0\Big\} \le C(u_1, \ldots, u_d) \le \min_{j \in \{1, \ldots, d\}} u_j = M(u_1, \ldots, u_d),$$

where $W$ and $M$ are called the Fréchet-Hoeffding copula boundaries; $M$ is itself a copula in
every dimension, whereas $W$ is a copula only when $d = 2$.
Moreover, we define the independent copula $\Pi$ as $\Pi(u_1, \ldots, u_d) = \prod_{i=1}^d u_i$, for
any $u_i$ in $[0,1]$, $1 \le i \le d$.
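The bounds of Remark 1.1 can be checked numerically. The following is a minimal sketch, not part of the original method: the function names and the evaluation grid are illustrative assumptions, and the check is only performed for the independent copula.

```python
from itertools import product


def frechet_lower(u):
    """Lower Frechet-Hoeffding bound W(u_1, ..., u_d)."""
    return max(1.0 - len(u) + sum(u), 0.0)


def frechet_upper(u):
    """Upper Frechet-Hoeffding bound M(u_1, ..., u_d)."""
    return min(u)


def independent_copula(u):
    """Independent copula: product of the coordinates."""
    result = 1.0
    for ui in u:
        result *= ui
    return result


def bounds_hold(copula, d, grid=(0.0, 0.25, 0.5, 0.75, 1.0), tol=1e-12):
    """Check W <= C <= M on a finite grid of [0, 1]^d (illustrative grid)."""
    return all(
        frechet_lower(u) - tol <= copula(u) <= frechet_upper(u) + tol
        for u in product(grid, repeat=d)
    )


print(bounds_hold(independent_copula, d=3))
```

A grid check of this kind is only a sanity test; it does not prove the inequality.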

Finally, we define the density of a copula as the density associated with the cdf $C$, which we
will name $c$:

Definition 1.2. Should it exist, the density of $C$ is defined by
$c(u_1, \ldots, u_d) = \frac{\partial^d}{\partial u_1 \cdots \partial u_d} C(u_1, \ldots, u_d)$,
for any $u_i$ in $[0,1]$, $1 \le i \le d$.

1.1.2. The Gaussian copula 

The Gaussian copula can be used in several fields. For example, many credit models are
built from this copula, which also has the property of making extreme values (minimal or
maximal) independent in the limit; see Joe (1997) for more details. For example, in
$\mathbb{R}^2$, it is derived from the bivariate normal distribution and from Sklar's theorem.
Defining $\Psi_\rho$ as the standard bivariate normal cumulative distribution function with
correlation $\rho$, the Gaussian copula function is

$$C_\rho(u, v) = \Psi_\rho(\Psi^{-1}(u), \Psi^{-1}(v)),$$

where $u, v \in [0,1]$ and where $\Psi$ is the standard normal cumulative distribution function.
Then, the copula density function is

$$c_\rho(u, v) = \frac{\psi_{X,Y,\rho}(\Psi^{-1}(u), \Psi^{-1}(v))}{\psi(\Psi^{-1}(u))\,\psi(\Psi^{-1}(v))},$$

where $\psi_{X,Y,\rho}(x, y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\Big(-\frac{1}{2(1-\rho^2)}\big[x^2 + y^2 - 2\rho x y\big]\Big)$
is the density function of the standard bivariate Gaussian with Pearson product-moment correlation
coefficient $\rho$ and where $\psi$ is the standard normal density. This definition can be
extended to $\mathbb{R}^d$.
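The bivariate Gaussian copula density can be evaluated directly from this ratio. The sketch below uses only the Python standard library; the function names are illustrative assumptions, and the inputs are assumed to lie strictly inside $(0,1)$.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal: cdf, pdf and inverse cdf


def bivariate_normal_pdf(x, y, rho):
    """Density of the standard bivariate Gaussian with correlation rho."""
    quad = x * x + y * y - 2.0 * rho * x * y
    norm = 2.0 * pi * sqrt(1.0 - rho * rho)
    return exp(-quad / (2.0 * (1.0 - rho * rho))) / norm


def gaussian_copula_density(u, v, rho):
    """Copula density: joint density over the product of the margins,
    both evaluated at the normal quantiles of u and v (0 < u, v < 1)."""
    x, y = _STD.inv_cdf(u), _STD.inv_cdf(v)
    return bivariate_normal_pdf(x, y, rho) / (_STD.pdf(x) * _STD.pdf(y))


print(gaussian_copula_density(0.3, 0.7, 0.5))
```

With $\rho = 0$ the ratio reduces to 1 everywhere, which is the density of the independent copula.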

1.1.3. The elliptical copula 

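Let us begin by defining the class of elliptical distributions and its properties - see also
Cambanis (1981) and Landsman (2003):

Definition 1.3. $X$ is said to abide by a multivariate elliptical distribution, denoted
$X \sim E_d(\mu, \Sigma, \xi_d)$, if $X$ has the following density, for any $x$ in $\mathbb{R}^d$:

$$f_X(x) = \frac{\alpha_d}{|\Sigma|^{1/2}}\, \xi_d\Big(\frac{1}{2}(x - \mu)' \Sigma^{-1} (x - \mu)\Big),$$

• where $\Sigma$ is a $d \times d$ positive-definite matrix and where $\mu$ is a $d$-column vector,

• where $\xi_d$ is referred to as the "density generator",

• where $\alpha_d$ is a normalisation constant, such that $\alpha_d = \frac{\Gamma(d/2)}{(2\pi)^{d/2}} \Big(\int_0^\infty x^{d/2-1} \xi_d(x)\,dx\Big)^{-1}$,
with $\int_0^\infty x^{d/2-1} \xi_d(x)\,dx < \infty$.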

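Property 1.1. 1/ For any $X \sim E_d(\mu, \Sigma, \xi_d)$, for any $m \times d$ matrix $A$ with rank
$m \le d$, and for any $m$-dimensional vector $b$, we have $AX + b \sim E_m(A\mu + b, A\Sigma A', \xi_m)$.
Therefore, any marginal density of a multivariate elliptical distribution is elliptical, i.e.
$X = (X_1, X_2, \ldots, X_d) \sim E_d(\mu, \Sigma, \xi_d) \Rightarrow X_i \sim E_1(\mu_i, \sigma_i^2, \xi_1)$,
$1 \le i \le d$, with $f_{X_i}(x) = \frac{\alpha_1}{\sigma_i}\, \xi_1\Big(\frac{1}{2}\big(\frac{x - \mu_i}{\sigma_i}\big)^2\Big)$.

2/ Corollary 5 of Cambanis (1981) states that conditional densities of elliptical distributions
are also elliptical. Indeed, if $X = (X_1, X_2)' \sim E_d(\mu, \Sigma, \xi_d)$, with $X_1$ (resp. $X_2$)
of size $d_1 < d$ (resp. $d_2 < d$), then $X_1/(X_2 = a) \sim E_{d_1}(\mu', \Sigma', \xi_{d_1})$ with
$\mu' = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(a - \mu_2)$ and
$\Sigma' = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$, with $\mu = (\mu_1, \mu_2)$ and
$\Sigma = (\Sigma_{ij})_{1 \le i,j \le 2}$.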
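As a small worked illustration of the conditional parameters $\mu'$ and $\Sigma'$, here is a sketch restricted to the bivariate case $d_1 = d_2 = 1$, where all blocks are scalars. The function name and the numerical values are illustrative assumptions, not taken from the article.

```python
def conditional_elliptical_params(mu, sigma, a):
    """Location and scale parameter of X_1 given X_2 = a, when (X_1, X_2)
    is elliptical with location mu and scale matrix sigma (2 x 2 case)."""
    mu1, mu2 = mu
    (s11, s12), (s21, s22) = sigma
    mu_cond = mu1 + s12 / s22 * (a - mu2)      # mu' = mu_1 + S12 S22^-1 (a - mu_2)
    sigma_cond = s11 - s12 * s21 / s22         # S'  = S11 - S12 S22^-1 S21
    return mu_cond, sigma_cond


print(conditional_elliptical_params((0.0, 0.0), ((1.0, 0.5), (0.5, 1.0)), 2.0))
```

For higher dimensions the same two formulas apply with matrix blocks in place of scalars.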
Remark 1.2. Landsman (2003) shows that multivariate Gaussian distributions derive from
$\xi_d(x) = e^{-x}$. They also show that if $X = (X_1, \ldots, X_d)$ has an elliptical density such
that its marginals verify $E(X_i) < \infty$ and $E(X_i^2) < \infty$ for $1 \le i \le d$, then $\mu$ is
the mean of $X$ and $\Sigma$ is a multiple of the covariance matrix of $X$. Consequently, from now on,
we will assume this is indeed the case.

Definition 1.4. Let $t$ be an elliptical density on $\mathbb{R}^k$ and let $q$ be an elliptical
density on $\mathbb{R}^{k'}$. The elliptical densities $t$ and $q$ are said to belong to the same
family of elliptical densities if their generating densities are $\xi_k$ and $\xi_{k'}$
respectively, which belong to a common given family of densities.

Example 1.1. Consider two Gaussian densities $N(0, 1)$ and $N((0,0), Id_2)$. They are said to
belong to the same elliptical family as they both present $x \mapsto e^{-x}$ as generating density.

Finally, let us introduce the definition of an elliptical copula which generalizes the above 
overview of the Gaussian copula : 

Definition 1.5. Elliptical copulas are the copulas of elliptical distributions. 

1.2. Brief introduction to the $\phi$-projection pursuit methodology ($\phi$-PP)
Let us first introduce the concept of $\phi$-divergence.

1.2.1. The concept of $\phi$-divergence

Let $\phi$ be a strictly convex function defined by $\phi : \mathbb{R}^+ \to \mathbb{R}^+$, and such
that $\phi(1) = 0$. We define a $\phi$-divergence of $P$ from $Q$ - where $P$ and $Q$ are two
probability distributions over a space $\Omega$ such that $Q$ is absolutely continuous with respect
to $P$ - by

$$D_\phi(Q, P) = \int \phi\Big(\frac{dQ}{dP}\Big)\,dP,$$

or $D_\phi(q, p) = \int \phi\Big(\frac{q(x)}{p(x)}\Big)\,p(x)\,dx$, if $P$ and $Q$ present $p$ and $q$
as density respectively.

Throughout this article, we will also assume that $\phi(0) < \infty$, that $\phi'$ is continuous and
that this divergence is greater than the $L^1$ distance - see also Appendix B.
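To make the definition concrete, the following sketch evaluates a $\phi$-divergence between two discrete distributions, where the integral becomes a finite sum. The two convex functions below (Kullback-Leibler and $\chi^2$ type) are common textbook choices used here as assumptions; the sketch does not claim that they satisfy the additional conditions imposed in this article.

```python
from math import log


def phi_kl(t):
    """phi(t) = t log t - t + 1 (Kullback-Leibler type), with phi(0) = 1."""
    return t * log(t) - t + 1.0 if t > 0 else 1.0


def phi_chi2(t):
    """phi(t) = (t - 1)^2 / 2 (chi-square type)."""
    return 0.5 * (t - 1.0) ** 2


def phi_divergence(q, p, phi):
    """Discrete analogue of D_phi(q, p) = sum_x phi(q(x) / p(x)) p(x).
    Assumes q is absolutely continuous with respect to p."""
    return sum(phi(qi / pi) * pi for qi, pi in zip(q, p) if pi > 0)


p = (0.25, 0.75)
q = (0.5, 0.5)
print(phi_divergence(q, p, phi_kl), phi_divergence(q, p, phi_chi2))
```

Both choices satisfy $\phi(1) = 0$, so the divergence of a distribution from itself is zero.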

1.2.2. Functioning of the algorithm

Let $f$ be a density on $\mathbb{R}^d$. We define an instrumental density $g$ with the same mean and
variance as $f$. We start by performing the $D_\phi(g, f) = 0$ test; should this test turn out to be
positive, then $f = g$ and the algorithm stops; otherwise, the first step of our algorithm consists
in defining a vector $a_1$ and a density $g^{(1)}$ by

$$a_1 = \arg\inf_{a \in \mathbb{R}^d_*} D_\phi\Big(g\frac{f_a}{g_a}, f\Big) \quad \text{and} \quad g^{(1)} = g\frac{f_{a_1}}{g_{a_1}}, \tag{1.2}$$

where $\mathbb{R}^d_*$ is the set of non-null vectors of $\mathbb{R}^d$ and $f_a$ (resp. $g_a$) stands
for the density of $a^\top X$ (resp. $a^\top Y$) when $f$ (resp. $g$) is the density of $X$ (resp. $Y$).

In our second step, we replace $g$ with $g^{(1)}$ and we repeat the first step.
By iterating this process, we end up obtaining a sequence $(a_1, a_2, \ldots)$ of vectors
in $\mathbb{R}^d_*$ and a sequence of densities $g^{(i)}$. We will thus prove that the underlying
structures of $f$ evidenced through this method are identical to the ones obtained through
projection pursuit methodologies based on Kullback-Leibler divergence maximisation, such as Huber's
method - see Appendix E.3. We will also evidence the above structures, which will enable us to infer
more information on $f$ - see the example below.

Remark 1.3. First, to obtain an approximation of $f$, we stop our algorithm when the divergence
equals zero, i.e. we stop when $D_\phi(g^{(j)}, f) = 0$, since it implies $g^{(j)} = f$ with $j \le d$,
or when our algorithm reaches the $d$-th iteration, i.e. we approximate $f$ with $g^{(d)}$.
Second, we get $D_\phi(g^{(0)}, f) \ge D_\phi(g^{(1)}, f) \ge \cdots \ge 0$ with $g^{(0)} = g$.

Finally, the specific form of the relationship (1.2) establishes that we deal with M-estimation. We
can therefore state that our method is more robust than projection pursuit methodologies based
on Kullback-Leibler divergence maximisation - see Yohai (2008), Toma, as well as Huber.



Let us now study the following example:

Example 1.2. Let $f$ be a density defined on $\mathbb{R}^3$ by $f(x_1, x_2, x_3) = n(x_1, x_2)h(x_3)$,
with $n$ being a bi-dimensional Gaussian density, and $h$ being a non-Gaussian density. Let us also
consider $g$, a Gaussian density with the same mean and variance as $f$.

Since $g(x_1, x_2/x_3) = n(x_1, x_2)$, we have
$D_\phi(g\frac{f_3}{g_3}, f) = D_\phi(n \cdot f_3, f) = D_\phi(f, f) = 0$ as $f_3 = h$,
i.e. the function $a \mapsto D_\phi(g\frac{f_a}{g_a}, f)$ reaches zero for $e_3 = (0, 0, 1)'$ - where
$f_3$ and $g_3$ are the third marginal densities of $f$ and $g$ respectively. We therefore obtain
$g(x_1, x_2/x_3) = f(x_1, x_2/x_3)$.
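Example 1.2 can be verified pointwise with explicit densities. In the sketch below, the choices of $n$ (two independent standard normal components) and $h$ (a Laplace density with mean 0 and variance 2) are illustrative assumptions; the article only requires $n$ to be Gaussian and $h$ to be non-Gaussian.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean=0.0, var=1.0):
    return exp(-((x - mean) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def n(x1, x2):
    """Bivariate Gaussian factor (assumed: independent standard components)."""
    return normal_pdf(x1) * normal_pdf(x2)


def h(x3):
    """Non-Gaussian factor (assumed: Laplace(0, 1), mean 0, variance 2)."""
    return 0.5 * exp(-abs(x3))


def f(x1, x2, x3):
    return n(x1, x2) * h(x3)


def g(x1, x2, x3):
    """Gaussian density with the same mean and covariance as f."""
    return n(x1, x2) * normal_pdf(x3, 0.0, 2.0)


def g_step1(x1, x2, x3):
    """g^(1) = g * f_3 / g_3, the update along e_3 = (0, 0, 1)."""
    f3 = h(x3)                       # third marginal of f
    g3 = normal_pdf(x3, 0.0, 2.0)    # third marginal of g
    return g(x1, x2, x3) * f3 / g3


point = (0.3, -1.2, 0.7)
print(f(*point), g_step1(*point))
```

Since $g^{(1)}$ and $f$ agree at every point, the divergence between them is zero after a single step along $e_3$.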

To recapitulate our method, if $D_\phi(g, f) = 0$, we derive $f$ from the relationship $f = g$; should
a sequence $(a_i)_{i=1,\ldots,j}$, $j < d$, of vectors in $\mathbb{R}^d_*$ defining $g^{(j)}$ and such that
$D_\phi(g^{(j)}, f) = 0$ exist, then
$f(\cdot/a_i^\top x, 1 \le i \le j) = g(\cdot/a_i^\top x, 1 \le i \le j)$, i.e. $f$ coincides with $g$ on
the complement of the vector subspace generated by the family $\{a_i\}_{i=1,\ldots,j}$ - see also
section 2 for a more detailed explanation.

In the remainder of the study of the algorithm, after having clarified the choice of $g$, we will
consider the statistical solution to the representation problem, assuming that $f$ is unknown and
that $X_1, X_2, \ldots, X_m$ are i.i.d. with density $f$. We will provide asymptotic results pertaining
to the family of optimizing vectors $a_{k,m}$ - that we will define more precisely below - as $m$ goes
to infinity. Our results also prove that the empirical representation scheme converges towards the
theoretical one.



2. The algorithm 

2.1. The model

Let $f$ be a density on $\mathbb{R}^d$. We assume there exist $d$ non-null linearly independent
vectors $a_j$, with $1 \le j \le d$, of $\mathbb{R}^d$, such that

$$f(x) = n(a_{j+1}^\top x, \ldots, a_d^\top x)\, h(a_1^\top x, \ldots, a_j^\top x) \tag{2.1}$$

with $j < d$, $n$ being an elliptical density on $\mathbb{R}^{d-j}$ and with $h$ being a density on
$\mathbb{R}^j$ which does not belong to the same family as $n$. Let $X = (X_1, \ldots, X_d)$ be a vector
with $f$ as density.

We define $g$ as an elliptical distribution with the same mean and variance as $f$.

For simplicity, let us assume that the family $\{a_j\}_{1 \le j \le d}$ is the canonical basis of
$\mathbb{R}^d$.

The very definition of $f$ implies that $(X_{j+1}, \ldots, X_d)$ is independent from
$(X_1, \ldots, X_j)$. Hence, the density of $(X_{j+1}, \ldots, X_d)$ given $(X_1, \ldots, X_j)$ is $n$.

Let us assume that $D_\phi(g^{(j)}, f) = 0$ for some $j \le d$. We then get

$$\frac{f(x)}{f_{a_1} f_{a_2} \cdots f_{a_j}} = \frac{g(x)}{[g^{(0)}]_{a_1} [g^{(1)}]_{a_2} \cdots [g^{(j-1)}]_{a_j}},$$

since, by induction, we have

$$g^{(j)}(x) = g(x)\, \frac{f_{a_1}}{[g^{(0)}]_{a_1}}\, \frac{f_{a_2}}{[g^{(1)}]_{a_2}} \cdots \frac{f_{a_j}}{[g^{(j-1)}]_{a_j}}.$$

Consequently, through lemma F.9 and through the fact that the conditional densities of
elliptical distributions are also elliptical, as well as through the above relationship, we can
infer that

$$n(a_{j+1}^\top x, \ldots, a_d^\top x) = f(\cdot/a_i^\top x, 1 \le i \le j) = g(\cdot/a_i^\top x, 1 \le i \le j).$$

In other words, $f$ coincides with $g$ on the complement of the vector subspace generated by the
family $\{a_i\}_{i=1,\ldots,j}$.

Now, if the family $\{a_j\}_{1 \le j \le d}$ is no longer the canonical basis of $\mathbb{R}^d$, then
this family is again a basis of $\mathbb{R}^d$. Hence, lemma F.2 implies that

$$g(\cdot/a_1^\top x, \ldots, a_j^\top x) = n(a_{j+1}^\top x, \ldots, a_d^\top x) = f(\cdot/a_1^\top x, \ldots, a_j^\top x) \tag{2.2}$$

which is equivalent to $D_\phi(g^{(j)}, f) = 0$, since by induction
$g^{(j)} = g\, \frac{f_{a_1}}{[g^{(0)}]_{a_1}}\, \frac{f_{a_2}}{[g^{(1)}]_{a_2}} \cdots \frac{f_{a_j}}{[g^{(j-1)}]_{a_j}}$.

The end of our algorithm implies that $f$ coincides with $g$ on the complement of the vector
subspace generated by the family $\{a_i\}_{i=1,\ldots,j}$. Therefore, the nullity of the
$\phi$-divergence provides us with information on the density structure.

In summary, the following proposition clarifies our choice of $g$, which depends on the family of
distributions one wants to find in $f$:

Proposition 2.1. With the above notations, $D_\phi(g^{(j)}, f) = 0$ is equivalent to

$$g(\cdot/a_1^\top x, \ldots, a_j^\top x) = f(\cdot/a_1^\top x, \ldots, a_j^\top x).$$

More generally, the above proposition leads us to define the co-support of $f$ as the vector
space generated by the vectors $a_1, \ldots, a_j$.

Definition 2.1. Let $f$ be a density on $\mathbb{R}^d$. We define the co-vectors of $f$ as the sequence
of vectors $a_1, \ldots, a_j$ which solves the problem $D_\phi(g^{(j)}, f) = 0$, where $g$ is an
elliptical distribution with the same mean and variance as $f$. We define the co-support of $f$ as
the vector space generated by the vectors $a_1, \ldots, a_j$.

Remark 2.1. Any family $(a_i)$ defining $f$ as in (2.1) is an orthogonal basis of $\mathbb{R}^d$ - see lemma

2.2. Stochastic outline of our algorithm

Let $X_1, X_2, \ldots, X_m$ (resp. $Y_1, Y_2, \ldots, Y_m$) be a sequence of $m$ independent random
vectors with the same density $f$ (resp. $g$). As customary in nonparametric $\phi$-divergence
optimizations, all estimates of $f$ and $f_a$, as well as all uses of Monte Carlo methods, are
performed using subsamples $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ - extracted respectively
from $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_m$ - since the estimates are bounded below by some
positive deterministic sequence $\theta_m$ - see Appendix C.

Let $P_n$ be the empirical measure based on the subsample $X_1, X_2, \ldots, X_n$. Let $f_n$ (resp.
$f_{a,n}$ for any $a$ in $\mathbb{R}^d_*$) be the kernel estimate of $f$ (resp. $f_a$), which is built
from $X_1, X_2, \ldots, X_n$ (resp. $a^\top X_1, a^\top X_2, \ldots, a^\top X_n$).
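As an illustration of the kernel estimate $f_{a,n}$ built from the projected subsample $a^\top X_1, \ldots, a^\top X_n$, here is a minimal sketch. The Gaussian kernel and the rule-of-thumb bandwidth are assumptions made for this example only; the article does not specify them here, and the truncation by the sequence $\theta_m$ is not reproduced.

```python
from math import exp, pi, sqrt
from statistics import stdev


def project(sample, a):
    """Return the projections a'x for every vector x of the sample."""
    return [sum(ai * xi for ai, xi in zip(a, x)) for x in sample]


def gaussian_kde(points, bandwidth=None):
    """One-dimensional Gaussian kernel density estimate.
    Default bandwidth: 1.06 * s * n^(-1/5) (rule of thumb, an assumption)."""
    n = len(points)
    if bandwidth is None:
        bandwidth = 1.06 * stdev(points) * n ** (-0.2)

    def density(t):
        total = sum(exp(-0.5 * ((t - p) / bandwidth) ** 2) for p in points)
        return total / (n * bandwidth * sqrt(2.0 * pi))

    return density


sample = [(0.0, 1.0), (1.0, 0.0), (2.0, 2.0), (-1.0, 0.5)]
f_a_n = gaussian_kde(project(sample, (1.0, 0.0)))
print(f_a_n(0.5))
```

The returned function is a genuine density (non-negative, integrating to one), so it can be plugged into ratios such as $g f_{a,n}/g_a$.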

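As defined in section 1.2, we introduce the following sequences $(a_k)_{k \ge 1}$ and $(g^{(k)})_{k \ge 1}$:

• $a_k$ is a non-null vector of $\mathbb{R}^d$ such that
$a_k = \arg\min_{a \in \mathbb{R}^d_*} D_\phi\Big(g^{(k-1)} \frac{f_a}{[g^{(k-1)}]_a}, f\Big)$ (2.3)

• $g^{(k)}$ is the density such that $g^{(k)} = g^{(k-1)} \frac{f_{a_k}}{[g^{(k-1)}]_{a_k}}$ with $g^{(0)} = g$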



The stochastic setting up of the algorithm uses $f_n$ and $g_n^{(0)} = g$ instead of $f$ and
$g^{(0)} = g$, since $g$ is known. Thus, at the first step, we build the vector $\check a_1$ which
minimizes the $\phi$-divergence between $f_n$ and $g\frac{f_{a,n}}{g_a}$ and which estimates $a_1$.

Proposition C.1 and lemma F.8 enable us to minimize the $\phi$-divergence between $f_n$ and
$g\frac{f_{a,n}}{g_a}$. Defining $\check a_1$ as the argument of this minimization, proposition 3.3
shows us that this vector tends to $a_1$.

Finally, we define the density $\check g_n^{(1)}$ as $\check g_n^{(1)} = g\frac{f_{\check a_1,n}}{g_{\check a_1}}$,
which estimates $g^{(1)}$ through theorem 3.1.

Now, from the second step and as defined in section 1.2, the density $g^{(k-1)}$ is unknown.
Consequently, once again, we have to truncate the samples.

All estimates of $f$ and $f_a$ (resp. $g^{(1)}$ and $g_a^{(1)}$) are performed using a subsample
$X_1, X_2, \ldots, X_n$ (resp. $Y_1^{(1)}, Y_2^{(1)}, \ldots, Y_n^{(1)}$) extracted from
$X_1, X_2, \ldots, X_m$ (resp. $Y_1^{(1)}, Y_2^{(1)}, \ldots, Y_m^{(1)}$ - which is a sequence of $m$
independent random vectors with the same density $g^{(1)}$) such that the estimates are bounded
below by some positive deterministic sequence $\theta_m$ (see Appendix C).

Let $P_n$ be the empirical measure based on the subsample $X_1, X_2, \ldots, X_n$. Let $f_n$ (resp.
$g_n^{(1)}$, $f_{a,n}$, $g_{a,n}^{(1)}$ for any $a$ in $\mathbb{R}^d_*$) be the kernel estimate of $f$
(resp. $g^{(1)}$ and $f_a$ as well as $g_a^{(1)}$) which is built from $X_1, X_2, \ldots, X_n$ (resp.
$Y_1^{(1)}, Y_2^{(1)}, \ldots, Y_n^{(1)}$ and $a^\top X_1, a^\top X_2, \ldots, a^\top X_n$ as well as
$a^\top Y_1^{(1)}, a^\top Y_2^{(1)}, \ldots, a^\top Y_n^{(1)}$).
The stochastic setting up of the algorithm uses $f_n$ and $g_n^{(1)}$ instead of $f$ and $g^{(1)}$.
Thus, we build the vector $\check a_2$ which minimizes the $\phi$-divergence between $f_n$ and
$g_n^{(1)}\frac{f_{a,n}}{g_{a,n}^{(1)}}$ - since $g^{(1)}$ and $g_a^{(1)}$ are unknown - and which
estimates $a_2$. Proposition C.1 and lemma F.8 enable us to minimize the $\phi$-divergence between
$f_n$ and $g_n^{(1)}\frac{f_{a,n}}{g_{a,n}^{(1)}}$. Defining $\check a_2$ as the argument of this
minimization, proposition 3.3 shows that this vector tends to $a_2$ in $n$. Finally, we define the
density $\check g_n^{(2)}$ as $\check g_n^{(2)} = g_n^{(1)}\frac{f_{\check a_2,n}}{g_{\check a_2,n}^{(1)}}$,
which estimates $g^{(2)}$ through theorem 3.1.

Proceeding in this way, we end up obtaining a sequence $(\check a_1, \check a_2, \ldots)$ of vectors in
$\mathbb{R}^d_*$ estimating the co-vectors of $f$ and a sequence of densities $(\check g_n^{(k)})_k$ such
that $\check g_n^{(k)}$ estimates $g^{(k)}$ through theorem 3.1.

3. Results 

3.1. Hypotheses on f

Let $X_1, X_2, \ldots, X_m$ (resp. $Y_1, Y_2, \ldots, Y_m$) be a sequence of $m$ independent random
vectors with the same density $f$ (resp. $g$). As customary in nonparametric $\phi$-divergence
optimizations, all estimates of $f$ and $f_a$, as well as all uses of Monte Carlo methods, are
performed using subsamples $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$ - extracted respectively
from $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_m$ - since the estimates are bounded below by some
positive deterministic sequence $\theta_m$ - see Appendix C.

Let $P_n$ be the empirical measure of the subsample $X_1, X_2, \ldots, X_n$. Let $f_n$ (resp. $f_{a,n}$
for any $a$ in $\mathbb{R}^d_*$) be the kernel estimate of $f$ (resp. $f_a$), which is built from
$X_1, X_2, \ldots, X_n$ (resp. $a^\top X_1, a^\top X_2, \ldots, a^\top X_n$).

Let us now define the set of hypotheses on $f$.

A discussion of several of these hypotheses can be found in the appendix.

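In the remainder of this section, for legibility, we replace $g$ with $g^{(k-1)}$. Let

$$M(b, a, x) = \int \phi'\Big(\frac{g(x) f_b(b^\top x)}{f(x) g_b(b^\top x)}\Big)\, g(x) \frac{f_a(a^\top x)}{g_a(a^\top x)}\,dx - \phi^*\Big(\phi'\Big(\frac{g(x) f_b(b^\top x)}{f(x) g_b(b^\top x)}\Big)\Big),$$

$$P_n M(b, a) = \int M(b, a, x)\,dP_n, \qquad PM(b, a) = \int M(b, a, x)\,dP,$$

where $P$ is the probability measure presenting $f$ as density.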
Similarly to chapter V of Van der Vaart (1998), let us define

(H1): For all $\varepsilon > 0$, there is $\eta > 0$ such that, for all $c \in \Theta^{D_\phi}$ verifying
$\|c - a_k\| \ge \varepsilon$, we have $PM(c, a) - \eta > PM(a_k, a)$, with $a \in \Theta$.
(H2): $\exists\, Z < 0$, $n_0 > 0$ such that ($n \ge n_0 \Rightarrow \sup_{a \in \Theta} \sup_{c \in \{\Theta^{D_\phi}\}^c} P_n M(c, a) < Z$).
(H3): There exist $V$, a neighbourhood of $a_k$, and $H$, a positive function, such that, for all
$c \in V$, we have $|M(c, a_k, x)| \le H(x)$ ($P$-a.s.) with $PH < \infty$.
(H4): There exists $V$, a neighbourhood of $a_k$, such that for all $\varepsilon$, there exists an
$\eta$ such that for all $c \in V$ and $a \in \Theta$ verifying $\|a - a_k\| \ge \varepsilon$, we have
$PM(c, a_k) < PM(c, a) - \eta$.

Putting $I_{a_k} = \frac{\partial^2}{\partial a^2} D_\phi\Big(g\frac{f_{a_k}}{g_{a_k}}, f\Big)$, let us now
consider four new hypotheses:

(H5): $P\|\frac{\partial}{\partial b} M(a_k, a_k)\|^2$ and $P\|\frac{\partial}{\partial a} M(a_k, a_k)\|^2$
are finite and the expressions $P\frac{\partial^2}{\partial b_i \partial b_j} M(a_k, a_k)$ and $I_{a_k}$
exist and are invertible.
(H6): There exists $k$ such that $PM(a_k, a_k) = 0$.
(H7): $(Var_P(M(a_k, a_k)))^{1/2}$ exists and is invertible.
(H0): $f$ and $g$ are assumed to be positive and bounded and such that
$K(g, f) \ge \int |f(x) - g(x)|\,dx$, where $K$ is the Kullback-Leibler divergence.

3.1.1. Estimation of the first co-vector of f

Let $\mathcal{R}$ be the class of all positive functions $r$ defined on $\mathbb{R}$ and such that
$g(x) r(a^\top x)$ is a density on $\mathbb{R}^d$ for all $a$ belonging to $\mathbb{R}^d_*$. The following
proposition shows that there exists a vector $a$ such that $\frac{f_a}{g_a}$ minimizes
$D_\phi(gr, f)$ in $r$:

Proposition 3.1. There exists a vector $a$ belonging to $\mathbb{R}^d_*$ such that

$$\arg\min_{r \in \mathcal{R}} D_\phi(gr, f) = \frac{f_a}{g_a} \quad \text{and} \quad r(a^\top x) = \frac{f_a(a^\top x)}{g_a(a^\top x)}.$$

Remark 3.1. This proposition proves that $a_1$ simultaneously optimises (E.1), (E.2) and (1.2). In
other words, it proves that the underlying structures of $f$ evidenced through our method are
identical to the ones obtained through projection pursuit methodologies based on Kullback-Leibler
divergence maximisation, such as Huber's methods - see Appendix E.



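Following Broniatowski (2009), let us introduce the estimate of
$D_\phi\Big(g\frac{f_{a,n}}{g_a}, f_n\Big)$ through

$$\check D_\phi\Big(g\frac{f_{a,n}}{g_a}, f_n\Big) = \int M(a, a, x)\,dP_n(x).$$

Proposition 3.2. Let $\check a$ be such that
$\check a := \arg\inf_{a \in \mathbb{R}^d_*} \check D_\phi\Big(g\frac{f_{a,n}}{g_a}, f_n\Big)$.

Then, $\check a$ is a strongly convergent estimate of $a$, as defined in proposition 3.1.

Let us also introduce the following sequences $(\check a_k)_{k \ge 1}$ and $(\check g_n^{(k)})_{k \ge 1}$,
for any given $n$ - see section

• $\check a_k$ is an estimate of $a_k$ as defined in proposition 3.2 with $\check g_n^{(k-1)}$ instead of $g$,

• $\check g_n^{(k)}$ is such that $\check g_n^{(0)} = g$ and
$\check g_n^{(k)}(x) = \check g_n^{(k-1)}(x) \frac{f_{\check a_k,n}(\check a_k^\top x)}{[\check g^{(k-1)}]_{\check a_k,n}(\check a_k^\top x)}$,
i.e. $\check g_n^{(k)}(x) = g(x) \prod_{j=1}^k \frac{f_{\check a_j,n}(\check a_j^\top x)}{[\check g^{(j-1)}]_{\check a_j,n}(\check a_j^\top x)}$.

We also note that $\check g_n^{(k)}$ is a density.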
3.1.2. Convergence study at the k-th step of the algorithm

In this paragraph, we show that the sequence $(\check a_k)_n$ converges towards $a_k$ and that the
sequence $(\check g_n^{(k)})_n$ converges towards $g^{(k)}$.

Let $\check c_n(a) = \arg\sup_{c \in \Theta} P_n M(c, a)$, with $a \in \Theta$, and
$\check \gamma_n = \arg\inf_{a \in \Theta} \sup_{c \in \Theta} P_n M(c, a)$. We state

Proposition 3.3. Almost surely, $\sup_{a \in \Theta} \|\check c_n(a) - a_k\|$ converges toward 0 and
$\check \gamma_n$ converges toward $a_k$.

Finally, the following theorem shows that $\check g_n^{(k)}$ converges almost everywhere towards $g^{(k)}$:

Theorem 3.1. It holds $\check g_n^{(k)} \to_n g^{(k)}$ a.s.

3.1.3. Testing of the criterion

In this paragraph, through a test of our criterion, namely
$a \mapsto \check D_\phi\Big(\check g_n^{(k-1)} \frac{f_{a,n}}{[\check g^{(k-1)}]_{a,n}}, f_n\Big)$, we will
build a stopping rule for this procedure. First, the next theorem enables us to derive the law of
our criterion:

Theorem 3.2. For a fixed $k$, we have

$$\sqrt{n}\,\big(Var_P(M(\check c_n(\check \gamma_n), \check \gamma_n))\big)^{-1/2} \big(P_n M(\check c_n(\check \gamma_n), \check \gamma_n) - P_n M(a_k, a_k)\big) \xrightarrow{Law} N(0, I),$$

where $k$ represents the $k$-th step of our algorithm and where $I$ is the identity matrix in
$\mathbb{R}^d$.

Note that $k$ is fixed in theorem 3.2 since
$\check \gamma_n = \arg\inf_{a \in \Theta} \sup_{c \in \Theta} P_n M(c, a)$, where $M$ is a known function
of $k$ - see section 3.1. Thus, in the case when
$D_\phi\Big(g^{(k-1)} \frac{f_{a_k}}{[g^{(k-1)}]_{a_k}}, f\Big) = 0$, we obtain

Corollary 3.1. We have
$\sqrt{n}\,\big(Var_P(M(\check c_n(\check \gamma_n), \check \gamma_n))\big)^{-1/2} P_n M(\check c_n(\check \gamma_n), \check \gamma_n) \xrightarrow{Law} N(0, I)$.

Hence, we propose the test of the null hypothesis

$(H_0)$: $D_\phi\Big(g^{(k-1)} \frac{f_{a_k}}{[g^{(k-1)}]_{a_k}}, f\Big) = 0$ versus the alternative
$(H_1)$: $D_\phi\Big(g^{(k-1)} \frac{f_{a_k}}{[g^{(k-1)}]_{a_k}}, f\Big) \ne 0$.

Based on this result, we stop the algorithm; then, defining $a_k$ as the last vector generated, we
derive from corollary 3.1 an $\alpha$-level confidence ellipsoid around $a_k$, namely

$$\mathcal{E}_k = \big\{b \in \mathbb{R}^d;\ \sqrt{n}\,(Var_P(M(b, b)))^{-1/2} P_n M(b, b) \le q_\alpha^{N(0,1)}\big\},$$

where $q_\alpha^{N(0,1)}$ is the quantile of an $\alpha$-level reduced centered normal distribution and
where $P_n$ is the empirical measure arising from a realization of the sequences
$(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_n)$.
Consequently, the following corollary provides us with a confidence region for the above test:

Corollary 3.2. $\mathcal{E}_k$ is a confidence region for the test of the null hypothesis $(H_0)$
versus $(H_1)$.



4. Goodness-of-fit tests

4.1. The basic idea 

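Let $f$ be a density defined on $\mathbb{R}^2$. Let us also consider $g$, a known elliptical density
with the same mean and variance as $f$. Let us also assume that the family $(a_i)$ is the canonical
basis of $\mathbb{R}^2$ and that $D_\phi(g^{(2)}, f) = 0$.

Hence, since lemma F.9 implies that $[g^{(j-1)}]_{a_j} = g_{a_j}$ if $j \le d$, we then have
$g^{(2)}(x) = g(x)\frac{f_1}{g_1}\frac{f_2}{[g^{(1)}]_2} = g(x)\frac{f_1}{g_1}\frac{f_2}{g_2}$.
Moreover, we get $f = g^{(2)}$, as derived from property B.1.

Consequently, $f = g(x)\frac{f_1 f_2}{g_1 g_2}$, i.e. $\frac{f}{f_1 f_2} = \frac{g}{g_1 g_2}$, and then

$$\frac{\partial^2}{\partial x \partial y} C_f = \frac{\partial^2}{\partial x \partial y} C_g,$$

where $C_f$ (resp. $C_g$) is the copula of $f$ (resp. $g$).

More generally, if $f$ is defined on $\mathbb{R}^d$, then the family $(a_i)$ is once again free - see
lemma F.10 - i.e. the family $(a_i)$ is once again a basis of $\mathbb{R}^d$. The relationship
$D_\phi(g^{(d)}, f) = 0$ therefore implies that $g^{(d)} = f$, i.e. for any $x \in \mathbb{R}^d$,

$$f(x) = g^{(d)}(x) = g(x) \prod_{k=1}^d \frac{f_{a_k}(a_k^\top x)}{[g^{(k-1)}]_{a_k}(a_k^\top x)} = g(x) \prod_{k=1}^d \frac{f_{a_k}(a_k^\top x)}{g_{a_k}(a_k^\top x)},$$

since lemma F.9 implies that $[g^{(k-1)}]_{a_k} = g_{a_k}$ if $k \le d$. In other words, for any
$x \in \mathbb{R}^d$, it holds

$$\frac{g(x)}{\prod_{k=1}^d g_{a_k}(a_k^\top x)} = \frac{f(x)}{\prod_{k=1}^d f_{a_k}(a_k^\top x)}. \tag{4.1}$$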



Finally, putting $A = (a_1, \ldots, a_d)$ and defining vector $y$ (resp. density $\tilde f$, copula
$\tilde C_f$ of $\tilde f$, density $\tilde g$, copula $\tilde C_g$ of $\tilde g$) as the expression of
vector $x$ (resp. density $f$, copula $C_f$ of $f$, density $g$, copula $C_g$ of $g$) in basis $A$, the
following proposition states that the density associated with the copula of $f$ equals the density
associated with the copula of $g$ in basis $A$:

Proposition 4.1. With the above notations, should a sequence $(a_i)_{i=1,\ldots,d}$ of non-null vectors
in $\mathbb{R}^d_*$ defining $g^{(d)}$ and such that $D_\phi(g^{(d)}, f) = 0$ exist, then

$$\frac{\partial^d}{\partial y_1 \cdots \partial y_d} \tilde C_f = \frac{\partial^d}{\partial y_1 \cdots \partial y_d} \tilde C_g.$$

4.2. With the elliptical copula 

Let $f$ be an unknown density defined on $\mathbb{R}^d$. The objective of the present section is to
determine whether the copula of $f$ is elliptical. We thus define an instrumental elliptical
density $g$ with the same mean and variance as $f$, and we follow the procedure of section 2.2. As
explained in section 4.1, we infer from proposition 4.1 that the copula of $f$ equals the copula of
$g$ when $D_\phi(g^{(d)}, f) = 0$, i.e. when $a_d$ is the last vector generated from the algorithm and
when $(a_i)$ is the canonical basis of $\mathbb{R}^d$. Thus, in order to verify this assertion,
corollary 3.1 provides us with an $\alpha$-level confidence ellipsoid around this vector, namely

$$\mathcal{E}_d = \big\{b \in \mathbb{R}^d;\ \sqrt{n}\,(Var_P(M(b, b)))^{-1/2} P_n M(b, b) \le q_\alpha^{N(0,1)}\big\},$$

where $q_\alpha^{N(0,1)}$ is the quantile of an $\alpha$-level reduced centered normal distribution,
where $P_n$ is the empirical measure arising from a realization of the sequences
$(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_n)$ - see Appendix C - and where $M$ is a known function of
$d$, $f_n$ and $g^{(d-1)}$ - see section 3.1.
Consequently, keeping the notations introduced in section 4.1, we can perform a statistical test
of the null hypothesis

$(H_0)$: $\frac{\partial^d}{\partial x_1 \cdots \partial x_d} C_f = \frac{\partial^d}{\partial x_1 \cdots \partial x_d} C_g$
versus the alternative
$(H_1)$: $\frac{\partial^d}{\partial x_1 \cdots \partial x_d} C_f \ne \frac{\partial^d}{\partial x_1 \cdots \partial x_d} C_g$.

Since, under $(H_0)$, we have $D_\phi(g^{(d)}, f) = 0$, the following theorem provides us with a
confidence region for this test.

Theorem 4.1. The set $\mathcal{E}_d$ is a confidence region for the test of the null hypothesis
$(H_0)$ versus the alternative $(H_1)$.

Remark 4.1. 1/ If $D_\phi(g^{(k)}, f) = 0$ for $k < d$, then we reiterate the algorithm until
$g^{(d)}$ is created, in order to obtain a relationship for the copula of $f$.
2/ If the $a_i$ do not constitute the canonical basis, then, keeping the notations introduced in
section 4.1, our algorithm meets the test:

$(H_0)$: $\frac{\partial^d}{\partial y_1 \cdots \partial y_d} \tilde C_f = \frac{\partial^d}{\partial y_1 \cdots \partial y_d} \tilde C_g$
versus
$(H_1)$: $\frac{\partial^d}{\partial y_1 \cdots \partial y_d} \tilde C_f \ne \frac{\partial^d}{\partial y_1 \cdots \partial y_d} \tilde C_g$.

Thus, our method enables us to tell whether the copula of $f$ equals the copula of $g$ in the
$(a_1, \ldots, a_d)$ basis.

4.3. With the independent copulas 

Let $f$ be a density on $\mathbb{R}^d$ and let $X$ be a random vector with $f$ as density. The
objective of this section is to determine whether $f$ is the product of its margins, i.e. whether
the copula of $f$ is the independent copula. Let thus $g$ be an instrumental product of univariate
Gaussian densities - with $diag(Var(X_1), \ldots, Var(X_d))$ as covariance matrix and with the same
mean as $f$. As explained in section 4.2, let us follow the procedure described in section 2.2;
proposition 4.1 then implies that the copula of $f$ is the independent copula when
$D_\phi(g^{(d)}, f) = 0$. Thus, we perform a statistical test of the null hypothesis:

$(H_0)$: $f = \prod_{i=1}^d f_i$ versus the alternative $(H_1)$: $f \ne \prod_{i=1}^d f_i$.

Since, under $(H_0)$, we have $D_\phi(g^{(d)}, f) = 0$, the following theorem provides us with a
confidence region for our test.

Theorem 4.2. Keeping the notations of section 4.2, the set $\mathcal{E}_d$ is a confidence region for
the test of the null hypothesis $(H_0)$ versus the alternative $(H_1)$.

Remark 4.2. 1/ As explained in section 4.2, if $D_\phi(g^{(k)}, f) = 0$ for $k < d$, we reiterate the
algorithm until $g^{(d)}$ is created, in order to derive a relationship for the copula of $f$.
2/ If the $a_i$ do not constitute the canonical basis, then, keeping the notations introduced in
section 4.1, our algorithm meets the test:

$(H_0)$: $f = \prod_{i=1}^d f_{a_i}$ versus the alternative $(H_1)$: $f \ne \prod_{i=1}^d f_{a_i}$.

Thus, our method enables us to determine whether the copula of $f$ is the independent copula in the
$(a_1, \ldots, a_d)$ basis.

4.4. Study of the subsequence $(g^{(k')})$ defined by $D_\phi(g^{(k')}, f) = 0$ for any $k'$

Let $Q$ be the set of non-negative integers defined by
$Q = \{k'_i;\ k'_1 = 1,\ k'_q = d,\ k'_i < k'_{i+1}\}$, where $q$ - such that $q \le d$ - is its
cardinal. In the present section, our goal is to study the subsequence $(g^{(k')})$ of the sequence
$(g^{(k)})_{k=1,\ldots,d}$ defined by $D_\phi(g^{(k')}, f) = 0$ for any $k'$ belonging to $Q$.

First, we have:

$D_\phi(g^{(d)}, f) = 0 \Leftrightarrow g^{(d)} = f$, through property B.1,

$\Leftrightarrow \frac{g(x)}{\prod_{k=1}^d g_{a_k}(a_k^\top x)} = \frac{f(x)}{\prod_{k=1}^d f_{a_k}(a_k^\top x)}$, as explained in section 4.2,

$\Leftrightarrow \frac{\tilde g(y)}{\prod_{k=1}^d \tilde g_k(y_k)} = \frac{\tilde f(y)}{\prod_{k=1}^d \tilde f_k(y_k)}$,
which amounts to the previous relationship written in the $A = (a_1, \ldots, a_d)$ basis with the
notations introduced in section 4.1.
Moreover, defining $k'_{i-1}$ as the integer preceding $k'_i$ in the space $\{1, \ldots, d\}$, with
$i > 1$, and as explained in section 2.1, the relationship $D_\phi(g^{(k'_i)}, f) = 0$ implies that

fiyi, . . .,yi,Jyu. . . ,yj.,,yk'.^^, . . .,yd) = fi,i+\{yi, ■ ■ ■,ytj 
where f^+i is the density of vector (aJX, ...,01 X) in the A - {a\, . . . , ad) basis. 

Consequently, f(y) = fi^iiyx,- ■ . ,}'j;)./2,3(}'i:^, . --^yi'). ■ ■ fq-i,d(yk'^_^ > ■ ■ -^yk'J- 
Hence, we can infer that 



f(y) fi,2(yu---,yk'J fiAyK^-'-^y-k',) fq-\Ayk'^_^,---,yk') 



KJk(yk) nlJ,(y,) ul,My,) Kk'J^(y^^ 



(4.2) 



The following theorem explicitly describes the form of the copula of $f$ in the
$A = (a_1, \ldots, a_d)$ basis:

Theorem 4.3. Defining C f. . as the copula of fij and keeping the notations introduced in sections 
^J}and^3\ it holds 

'd^^Zdiff ^ dy,...dr,f^^'--dy,-...dy-,,f^'-' " ' " . . . 

"Remark 4.3. If there exists i such that i < d and k'. — k'.^^^, then the notation fj+iiyt, ■ ■ ■,yh j) 
means fk'(yk')- Thus, if for any k, we have D^(g^''\ f) = 0, then, for any i < d, wehavek'- = k'-^y 
i.e. we have f — IVl^^fkiyk) - where fk is the k''^ marginal density of f. 

At present, using relationship (4.2) and remark 4.3, the following corollary gives us the copula of $f$ as equal to 1 in the $\{a_1, \ldots, a_d\}$ basis when, for any $k$, $D_\phi(g^{(k)}, f) = 0$:

Corollary 4.1. In the case where, for any $k$, $D_\phi(g^{(k)}, f) = 0$, it holds:

$\frac{\partial^d}{\partial y_1 \cdots \partial y_d} C_f = 1.$
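
As a brief check of why independence leads to a copula density equal to 1, the following worked equation uses the standard fact that the copula of independent margins is the product copula. The shorthand $u_k = F_k(y_k)$ and the lower-case $c$ for the copula density are our own notation, not the article's.

```latex
\begin{align*}
F(y_1,\dots,y_d) = \prod_{k=1}^{d} F_k(y_k)
  &\;\Longrightarrow\; C(u_1,\dots,u_d) = \prod_{k=1}^{d} u_k, \\
c(u_1,\dots,u_d)
  = \frac{\partial^d C}{\partial u_1 \cdots \partial u_d}(u_1,\dots,u_d)
  &= 1 \quad \text{on } [0,1]^d .
\end{align*}
```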



5. Simulations 



Let us examine two simulations and an application to real datasets. The first simulation studies the elliptical copula and the second studies the independent copula. In each simulation, our program will aim at creating a sequence of densities $(g^{(j)})$, $j = 1, \ldots, d$, such that $g^{(0)} = g$, $g^{(j)} = g^{(j-1)} f_{a_j} / [g^{(j-1)}]_{a_j}$ and $D_\phi(g^{(k)}, f) = 0$, where $D_\phi$ is a divergence - see appendix B for its definition - and

$a_j = \arg\inf_b D_\phi(g^{(j-1)} f_b / [g^{(j-1)}]_b, f)$, for all $j = 1, \ldots, d$. We will therefore perform the tests introduced in theorems 4.1 and 4.2.

Simulation 5.1. 

We are in dimension 2 (= d), and we use the $\chi^2$ divergence to perform our optimisations. Let us consider a sample of 50 (= n) values of a random variable $X$ with a density law $f$ defined by:

$f(x) = c_\rho(F_{Gumbel}(x_1), F_{Exponential}(x_2)) \cdot Gumbel(x_1) \cdot Exponential(x_2)$,

where:

• $c_\rho$ is the Gaussian copula with correlation coefficient $\rho = 0.5$,

• the Gumbel distribution parameters are -1 and 1 and the exponential density parameter is 2.

Let us then generate a Gaussian random variable $Y$ with a density - that we will name $g$ - presenting the same mean and variance as $f$.
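
A sample such as the one described above can be drawn by first sampling the Gaussian copula and then applying the inverse marginal distribution functions. The sketch below is only illustrative: the function name `sample_simulation_5_1` is hypothetical, and we assume that the Gumbel parameters -1 and 1 are the location and scale, and that the exponential parameter 2 is a rate; the article does not spell out these conventions.

```python
import math
import random
from statistics import NormalDist


def sample_simulation_5_1(n=50, rho=0.5, mu=-1.0, beta=1.0, rate=2.0, seed=0):
    """Draw n points with a Gaussian copula (correlation rho),
    a Gumbel(mu, beta) first margin and an Exponential(rate) second margin.
    The parameter conventions are assumptions, see the text above."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = 1e-12  # keeps the uniforms strictly inside (0, 1)
    sample = []
    for _ in range(n):
        # Correlated standard normal pair (Cholesky factor of a 2x2 matrix).
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Probability integral transform: uniforms carrying the copula.
        u1 = min(max(std_normal.cdf(z1), eps), 1.0 - eps)
        u2 = min(max(std_normal.cdf(z2), eps), 1.0 - eps)
        # Inverse marginal distribution functions.
        x1 = mu - beta * math.log(-math.log(u1))  # Gumbel quantile
        x2 = -math.log(1.0 - u2) / rate           # exponential quantile
        sample.append((x1, x2))
    return sample


points = sample_simulation_5_1()
print(len(points), points[0])
```

The fixed seed only makes the sketch reproducible; it plays no role in the method itself.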

We theoretically obtain $k = 2$ and $(a_1, a_2) = ((1,0), (0,1))$.
To get this result, we perform the following test:

$(H_0)$: $(a_1, a_2) = ((1,0), (0,1))$ versus $(H_1)$: $(a_1, a_2) \neq ((1,0), (0,1))$.

Then, theorem 4.1 enables us to verify $(H_0)$ by the following 0.9 (= $\alpha$) level confidence ellipsoid

$\mathcal{E}_2 = \{ b \in \mathbb{R}^2;\ (Var_P(M(b,b)))^{-1/2}\, \mathbb{P}_n M(b,b) \le q_\alpha^{N(0,1)} / \sqrt{n} \simeq 0.2533/7.0710 = 0.03582 \}$.

And, we obtain 



Table 1: Simulation 1: Numerical results of the optimisation

Our Algorithm
Projection Study 0:    minimum: 0.445199
                       at point: (1.0171, 0.0055)
                       P-Value: 0.94579
Test:                  $H_0$: $a_1 \in \mathcal{E}_1$: True
Projection Study 1:    minimum: 0.009628
                       at point: (0.0048, 0.9197)
                       P-Value: 0.99801
Test:                  $H_0$: $a_2 \in \mathcal{E}_2$: True
$\chi^2$(Kernel Estimation of $g^{(2)}$, $g^{(2)}$):    3.57809

Therefore, we can conclude that $H_0$ is verified.
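
The numerical bound used in the confidence ellipsoid above is simply the quoted quantile value divided by $\sqrt{n}$. The short check below reproduces that arithmetic; the helper name `ellipsoid_threshold` is hypothetical, and the value 0.2533 is taken as given from the text rather than recomputed.

```python
import math


def ellipsoid_threshold(quantile, n):
    """Right-hand side of the ellipsoid inequality: quantile / sqrt(n)."""
    return quantile / math.sqrt(n)


# n = 50 is the sample size of the simulations,
# n = 140 is the sample size of the real-data application.
print(round(ellipsoid_threshold(0.2533, 50), 5))   # 0.03582
print(round(ellipsoid_threshold(0.2533, 140), 5))  # 0.02141
```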
Figure 1: Graph of the estimate of $(x_1, x_2) \mapsto c_\rho(F_{Gumbel}(x_1), F_{Exponential}(x_2))$.




Simulation 5.2. 

We are in dimension 2 (= d), and we use the $\chi^2$ divergence to perform our optimisations.

Let us consider a sample of 50 (= n) values of a random variable $X$ with a density law $f$ defined by

$f(x) = Gumbel(x_1) \cdot Exponential(x_2)$,

where the Gumbel distribution parameters are -1 and 1 and the exponential density parameter is 2.

Let $g$ be an instrumental product of univariate Gaussian densities - with $diag(Var(X_1), \ldots, Var(X_d))$ as covariance matrix and with the same mean as $f$.

We theoretically obtain $k = 2$ and $(a_1, a_2) = ((1,0), (0,1))$. To get this result, we perform the following test:

$(H_0)$: $(a_1, a_2) = ((1,0), (0,1))$ versus $(H_1)$: $(a_1, a_2) \neq ((1,0), (0,1))$.

Then, theorem 4.2 enables us to verify $(H_0)$ by the following 0.9 (= $\alpha$) level confidence ellipsoid

$\mathcal{E}_2 = \{ b \in \mathbb{R}^2;\ (Var_P(M(b,b)))^{-1/2}\, \mathbb{P}_n M(b,b) \le q_\alpha^{N(0,1)} / \sqrt{n} \simeq 0.03582203 \}$.
And, we obtain 
Table 2: Simulation 2: Numerical results of the optimisation

Our Algorithm
Projection Study 0:    minimum: 0.057833
                       at point: (0.9890, 0.1009)
                       P-Value: 0.955651
Test:                  $H_0$: $a_1 \in \mathcal{E}_1$: True
Projection Study 1:    minimum: 0.02611
                       at point: (-0.1105, 0.9290)
                       P-Value: 0.921101
Test:                  $H_0$: $a_2 \in \mathcal{E}_2$: True
$\chi^2$(Kernel Estimation of $g^{(2)}$, $g^{(2)}$):    1.25945

Therefore, we can conclude that $f = \prod_{i=1}^{d} f_i$.



Figure 2: Graph of the independent copula estimate. 




5.0.7. Application to real datasets 

Let us for instance study the moves in the stock prices of Renault and Peugeot from January 4, 2010 to July 25, 2010. We thus gather 140 (= n) data from these stock prices - see data below. Let us also consider $X_1$ (resp. $X_2$) the random variable defining the stock price of Renault (resp. Peugeot). We will assume - as it is commonly done in mathematical finance - that the stock market abides by the classical hypotheses of the Black-Scholes model - see Black and Scholes (1973).

Consequently, $X_1$ and $X_2$ each present a log-normal distribution as probability distribution. Let $f$ be the density of vector $(\ln(X_1), \ln(X_2))$; let us now apply our algorithm to $f$ with the Kullback-Leibler divergence as $\phi$-divergence. Let us then generate a Gaussian random variable $Y$ with a density - that we will name $g$ - presenting the same mean and variance as $f$.
We first assume that there exists a vector $a$ such that $D_\phi(g \frac{f_a}{g_a}, f) = 0$.
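
To make the construction of the instrumental Gaussian density concrete, the sketch below computes the empirical mean vector and covariance matrix of the log-prices, which are the two ingredients $g$ must share with $f$. Only the first five rows of the stock-price table are used here for brevity, and the helper name `mean_and_cov` is hypothetical; the article does not describe its own implementation.

```python
import math

# (Renault, Peugeot) closing prices: first five rows of the stock-price table.
prices = [
    (34.9, 24.2),
    (34.26, 24.01),
    (33.15, 23.3),
    (32.69, 22.78),
    (33.24, 23.36),
]

# The density f is that of the vector (ln(X1), ln(X2)).
log_prices = [(math.log(r), math.log(p)) for r, p in prices]


def mean_and_cov(points):
    """Empirical mean and unbiased 2x2 covariance matrix of 2-D points."""
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    s11 = sum((p[0] - m1) ** 2 for p in points) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in points) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in points) / (n - 1)
    return (m1, m2), ((s11, s12), (s12, s22))


mean, cov = mean_and_cov(log_prices)
print(mean)
print(cov)
```

A Gaussian density with this mean and covariance would then play the role of $g$; with the full table, `prices` would simply hold all the rows.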

In order to verify this hypothesis, our reasoning will be the same as in Simulation 5.1. Indeed, we assume that this vector is a co-factor of $f$. Consequently, corollary 3.2 enables us to estimate $a$ by the following 0.9 (= $\alpha$) level confidence ellipsoid

$\mathcal{E}_1 = \{ b \in \mathbb{R}^2;\ (Var_P(M(b,b)))^{-1/2}\, \mathbb{P}_n M(b,b) \le q_\alpha^{N(0,1)} / \sqrt{n} \simeq 0.2533/\sqrt{140} = 0.02140776 \}$.
And, we obtain 



Table 3: Numerical results: First projection

Our Algorithm
Projection Study 0:    minimum: 0.02087685
                       at point: $a_1$ = (19.1, -12.3)
                       P-Value: 0.748765
Test:                  $H_0$: $a_1 \in \mathcal{E}_1$: True
K(Kernel Estimation of $g^{(1)}$, $g^{(1)}$):    4.3428735

Therefore, our first hypothesis is confirmed.

However, our goal is to study the copula of $(\ln(X_1), \ln(X_2))$. Then, as explained in section 4.4, we formulate another hypothesis assuming that there exists a vector $a$ such that $D_\phi(g^{(1)} \frac{f_a}{[g^{(1)}]_a}, f) = 0$.
In order to verify this hypothesis, we will use the same reasoning as above. Indeed, we assume that this vector is a co-factor of $f$. Consequently, corollary 3.2 enables us to estimate $a$ by the following 0.9 (= $\alpha$) level confidence ellipsoid

$\mathcal{E}_2 = \{ b \in \mathbb{R}^2;\ (Var_P(M(b,b)))^{-1/2}\, \mathbb{P}_n M(b,b) \le q_\alpha^{N(0,1)} / \sqrt{n} \simeq 0.2533/\sqrt{140} = 0.02140776 \}$.
And, we obtain 



Table 4: Numerical results: Second projection

Our Algorithm
Projection Study 1:    minimum: 0.0198753
                       at point: $a_2$ = (8.1, 3.9)
                       P-Value: 0.8743401
Test:                  $H_0$: $a_2 \in \mathcal{E}_2$: True
K(Kernel Estimation of $g^{(2)}$, $g^{(2)}$):    4.38475324

Therefore, our second hypothesis is confirmed.

In conclusion, as explained in corollary 4.1, the copula of $f$ is equal to 1 in the $\{a_1, a_2\}$ basis.






Figure 3: Graph of the copula of $(\ln(X_1), \ln(X_2))$ in the canonical basis.

Figure 4: Graph of the copula of $(\ln(X_1), \ln(X_2))$ in the $\{a_1, a_2\}$ basis.

Figure 5: Graph of the copula of $(\ln(X_1), \ln(X_2))$ in the $\{a_1, a_2\}$ basis - other view. (Axes: log of Renault stock price, log of Peugeot stock price.)



Table 5: Stock prices of Renault and Peugeot

Date      Renault  Peugeot    Date      Renault  Peugeot    Date      Renault  Peugeot
23/07/10  34.9     24.2       22/07/10  34.26    24.01      21/07/10  33.15    23.3
20/07/10  32.69    22.78      19/07/10  33.24    23.36      16/07/10  33.92    23.77
15/07/10  34.44    23.71      14/07/10  35.08    24.36      13/07/10  35.28    24.37
12/07/10  33.84    23.16      09/07/10  33.46    23.13      08/07/10  33.08    22.65
07/07/10  32.15    22.19      06/07/10  31.12    21.56      05/07/10  30.02    20.81
02/07/10  30.17    20.85      01/07/10  29.56    20.05      30/06/10  30.78    21.07
29/06/10  30.55    20.97      28/06/10  32.34    22.3       25/06/10  31.35    21.68
24/06/10  32.29    22.25      23/06/10  33.58    22.47      22/06/10  33.84    22.77
21/06/10  34.06    23.25      18/06/10  32.89    22.7       17/06/10  32.08    22.31
16/06/10  31.87    21.92      15/06/10  32.03    22.12      14/06/10  31.45    22.2
11/06/10  30.62    21.42      10/06/10  30.42    20.93      09/06/10  29.27    20.34
08/06/10  28.48    19.73      07/06/10  28.92    20.15      04/06/10  29.19    20.27
03/06/10  30.35    20.46      02/06/10  29.33    19.53      01/06/10  28.87    19.45
31/05/10  29.39    19.54      28/05/10  29.16    19.55      27/05/10  29.18    19.81
26/05/10  27.5     18.5       25/05/10  26.76    18.08      24/05/10  28.75    18.81
21/05/10  28.78    18.82      20/05/10  28.53    18.84      19/05/10  29.49    19.25
18/05/10  30.95    19.76      17/05/10  30.92    19.35      14/05/10  31.35    19.34
13/05/10  33.65    20.76      12/05/10  33.63    20.52      11/05/10  33.38    20.34
10/05/10  33.28    20.3       07/05/10  31       19.24      06/05/10  32.4     20.22
05/05/10  32.95    20.45      04/05/10  33.3     21.03      03/05/10  35.58    22.63
30/04/10  35.41    22.45      29/04/10  35.53    22.36      28/04/10  34.75    22.33
Table 6: Stock prices of Renault and Peugeot (continued)

Date      Renault  Peugeot    Date      Renault  Peugeot    Date      Renault  Peugeot
27/04/10  36.2     22.9       26/04/10  37.65    23.73      23/04/10  36.72    23.5
22/04/10  34.36    22.72      21/04/10  35.01    22.86      20/04/10  35.62    22.88
19/04/10  34.08    21.77      16/04/10  34.46    21.71      15/04/10  35.16    22.22
14/04/10  35.1     22.22      13/04/10  35.28    22.45      12/04/10  35.17    21.85
09/04/10  35.76    21.9       08/04/10  35.67    21.67      07/04/10  36.5     21.89
06/04/10  36.87    22         01/04/10  35.5     21.97      31/03/10  34.7     21.8
30/03/10  34.8     22.24      29/03/10  35.7     22.73      26/03/10  35.54    22.58
25/03/10  35.53    22.73      24/03/10  33.8     21.82      23/03/10  34.1     21.58
22/03/10  33.73    21.64      19/03/10  34.12    21.68      18/03/10  34.44    21.75
17/03/10  34.68    21.98      16/03/10  34.33    21.88      15/03/10  33.57    21.53
12/03/10  33.9     21.86      11/03/10  33.27    21.58      10/03/10  33.12    21.47
09/03/10  32.69    21.54      08/03/10  32.99    21.66      05/03/10  32.89    21.85
04/03/10  31.64    21.26      03/03/10  31.65    20.7       02/03/10  31.05    20.2
01/03/10  30.26    19.54      26/02/10  30.2     19.39      25/02/10  29.42    18.98
24/02/10  30.9     19.49      23/02/10  30.54    19.74      22/02/10  31.89    20.06
19/02/10  32.29    20.67      18/02/10  32.26    20.41      17/02/10  31.69    20.31
16/02/10  31.08    19.8       15/02/10  30.25    19.66      12/02/10  29.56    19.57
11/02/10  31       20.4       10/02/10  32.78    21.21      09/02/10  33.31    22.31
08/02/10  32.63    21.95      05/02/10  32.15    22.33      04/02/10  33.72    22.86
03/02/10  35.32    23.93      02/02/10  35.29    23.8       01/02/10  35.31    24.05
29/01/10  34.26    23.64      28/01/10  33.94    23.31      27/01/10  33.85    23.88
26/01/10  34.97    24.86      25/01/10  35.06    24.35      22/01/10  35.7     24.95
21/01/10  36.1     25         20/01/10  36.92    25.35      19/01/10  38.4     25.81
18/01/10  39.28    25.95      15/01/10  38.6     25.7       14/01/10  39.56    26.67
13/01/10  39.49    26.13      12/01/10  38.36    25.98      11/01/10  39.21    26.65
08/01/10  39.38    26.5       07/01/10  39.69    26.7       06/01/10  39.25    26.32
05/01/10  38.31    24.74      04/01/10  38.2     24.52









Critique of the simulations

In the case where $f$ is unknown, we can never be sure of having reached the minimum of the $\phi$-divergence: we have indeed used the simulated annealing method to solve our optimisation problem, and therefore it is only when the number of random jumps tends in theory towards infinity that the probability of reaching the minimum tends to 1. We also note that no theory on the optimal number of jumps to implement exists, as this number depends on the specificities of each particular problem.
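
To illustrate the role played by the number of random jumps, here is a minimal simulated annealing sketch on a toy objective. It is not the article's implementation: the cooling schedule `t0 / k`, the Gaussian proposal with standard deviation `step`, and the quadratic objective standing in for the divergence are all assumptions chosen for the example.

```python
import math
import random


def simulated_annealing(objective, start, n_jumps=5000, step=0.3, t0=1.0, seed=0):
    """Minimise `objective` by random jumps accepted with the Metropolis rule.
    Returns the best point visited and its objective value."""
    rng = random.Random(seed)
    current = list(start)
    f_cur = objective(current)
    best, f_best = list(current), f_cur
    for k in range(1, n_jumps + 1):
        temperature = t0 / k  # assumed cooling schedule
        candidate = [c + rng.gauss(0.0, step) for c in current]
        f_cand = objective(candidate)
        delta = f_cand - f_cur
        # Always accept improvements; accept deteriorations with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, f_cur = candidate, f_cand
            if f_cur < f_best:
                best, f_best = list(current), f_cur
    return best, f_best


def toy_objective(b):
    """Stand-in for b -> divergence; its minimum is at b = (1, 0)."""
    return (b[0] - 1.0) ** 2 + b[1] ** 2


best_point, best_value = simulated_annealing(toy_objective, (0.0, 0.0))
print(best_point, best_value)
```

Increasing `n_jumps` can only improve (or leave unchanged) the best value found for a given seed, which mirrors the remark above: the guarantee of reaching the minimum is only asymptotic.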

Moreover, we choose the 50 ~ for the AMISE of the two simulations. This choice leads us to 



simulate 50 random variables - see Scott (1992), page 151 - none of which have been discarded to obtain the truncated sample.

This has also been the case in our application to real datasets.

Finally, the shape of the copula in the case of real datasets in the $\{a_1, a_2\}$ basis is also noteworthy. Figure 4 shows that the curve reaches a quite wide plateau around 1, whereas Figure 5 shows that this plateau prevails on almost the entire $[0, 1]^2$ set. We can therefore conclude that the theoretical analysis is indeed confirmed by the above simulation.
Conclusion

Projection Pursuit is useful in evidencing characteristic structures as well as one-dimensional projections and their associated distribution in multivariate data. This article evidences the efficiency of the $\phi$-projection pursuit methodology for goodness-of-fit tests for copulas. Indeed, the robustness as well as the convergence results we achieved convincingly fulfilled our expectations regarding the methodology used.

A. On the different families of copula

There exist many copula families. Let us here present the most important amongst them.

A.1. Archimedean copulas

These copulas present a simple form with properties such as associativity and have a variety of dependence structures. They can generally be defined under the following form

$H(u_1, u_2, \ldots, u_n) = \Psi^{-1}\left( \sum_{i=1}^{n} \Psi(u_i) \right)$

where $(u_1, u_2, \ldots, u_n) \in [0, 1]^n$ and where $\Psi$ is known as a "generator function". This $\Psi$ function must be at least $d-2$ times continuously differentiable, must have a decreasing and convex $(d-2)$-th derivative, and must be such that $\Psi(1) = 0$.
Let us now present several examples:

1/ Clayton copula:

The Clayton copula is an asymmetric archimedean copula, exhibiting greater dependency in the negative tail than in the positive tail. Let us define $X$ (resp. $Y$) as the random variable having $F$ (resp. $G$) as cumulative distribution function (CDF). Assuming that the vector $(X, Y)$ has a Clayton copula, then this copula is given by:

$H(x, y) = \left( F(x)^{-\theta} + G(y)^{-\theta} - 1 \right)^{-1/\theta}$

And its generator is:

$\Psi(x) = x^{-\theta} - 1$

For $\theta = 0$ in the Clayton copula, the random variables are statistically independent. The generator function approach can be extended to create multivariate copulas, simply by including more additive terms.

2/ Gumbel copula:

The Gumbel copula (a.k.a. Gumbel-Hougaard copula) is an asymmetric archimedean copula, exhibiting greater dependency in the positive tail than in the negative tail. This copula is given by:

$\Psi(x) = (-\ln(x))^{\alpha}$
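
The generator construction described above can be turned into code directly: a bivariate archimedean copula is the inverse generator applied to the sum of the generators. The sketch below is illustrative only; the function names are hypothetical, the Clayton generator is taken as $x^{-\theta} - 1$ (one common normalisation), and a single parameter name `theta` is used for both families.

```python
import math


def archimedean(generator, inverse, us):
    """Generic archimedean copula: inverse generator of the summed generators."""
    return inverse(sum(generator(u) for u in us))


def clayton(u, v, theta):
    """Clayton copula with generator t -> t**(-theta) - 1, theta > 0."""
    return archimedean(
        lambda t: t ** (-theta) - 1.0,
        lambda s: (1.0 + s) ** (-1.0 / theta),
        (u, v),
    )


def gumbel(u, v, theta):
    """Gumbel copula with generator t -> (-ln t)**theta, theta >= 1."""
    return archimedean(
        lambda t: (-math.log(t)) ** theta,
        lambda s: math.exp(-(s ** (1.0 / theta))),
        (u, v),
    )


print(clayton(0.3, 0.7, 2.0))
print(gumbel(0.3, 0.7, 2.0))
```

With `theta = 1` the Gumbel copula reduces to the independence copula `u * v`, and the Clayton copula approaches it as `theta` tends to 0, in line with the remark made above for the Clayton family.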



3/ Frank copula:

The Frank copula is a symmetric archimedean copula given by:

$H(x, y) = -\frac{1}{\theta} \ln\left( 1 + \frac{(e^{-\theta F(x)} - 1)(e^{-\theta G(y)} - 1)}{e^{-\theta} - 1} \right)$

A.2. Periodic copula

In 2005, Aurélien Alfonsi and Damiano Brigo (2005) introduced a way of constructing copulas based on periodic functions. Defining $h$ (resp. $H$) as a 1-periodic non-negative function that integrates to 1 over $[0, 1]$ (resp. as a double primitive of $h$), then both

$H(u + v) - H(u) - H(v)$ and $-H(u - v) + H(u) + H(-v)$

are copula functions, the second one not being necessarily exchangeable.

B. $\phi$-Divergence

Let us call $h_a$ the density of $a^\top Z$ if $h$ is the density of $Z$. Let $\phi$ be a strictly convex function defined by $\phi : \mathbb{R}^+ \to \mathbb{R}^+$, and such that $\phi(1) = 0$.

Definition B.1. We define a $\phi$-divergence of $P$ from $Q$, where $P$ and $Q$ are two probability distributions over a space $\Omega$ such that $Q$ is absolutely continuous with respect to $P$, by

$D_\phi(Q, P) = \int \phi\left( \frac{dQ}{dP} \right) dP.$ (B.1)

The above expression (B.1) is also valid if $P$ and $Q$ are both dominated by the same probability.
The most used distances (Kullback, Hellinger or $\chi^2$) belong to the Cressie-Read family (see Cressie-Read (1984), Csiszár I. (1967) and the books of Friedrich and Igor (1987), Pardo Leandro (2006) and Zografos K.). They are defined by a specific $\phi$. Indeed,



- with the Kullback-Leibler divergence, we associate $\phi(x) = x\ln(x) - x + 1$

- with the Hellinger distance, we associate $\phi(x) = 2(\sqrt{x} - 1)^2$

- with the $\chi^2$ distance, we associate $\phi(x) = \frac{1}{2}(x - 1)^2$

- more generally, with power divergences, we associate $\phi(x) = \frac{x^\gamma - \gamma x + \gamma - 1}{\gamma(\gamma - 1)}$, where $\gamma \in \mathbb{R} \setminus \{0, 1\}$

- and, finally, with the $L^1$ norm, which is also a divergence, we associate $\phi(x) = |x - 1|$.
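
The list above can be made concrete on discrete distributions, where the integral in (B.1) becomes a finite sum $\sum_i p_i\, \phi(q_i / p_i)$. The sketch below is an illustration under that simplifying assumption (finite support, all $p_i > 0$); the names `PHI` and `phi_divergence` are hypothetical.

```python
import math

# Convex functions phi attached to each divergence, each with phi(1) = 0.
PHI = {
    # x ln(x) - x + 1, extended by continuity at x = 0
    "kullback_leibler": lambda x: x * math.log(x) - x + 1.0 if x > 0 else 1.0,
    "hellinger": lambda x: 2.0 * (math.sqrt(x) - 1.0) ** 2,
    "chi2": lambda x: 0.5 * (x - 1.0) ** 2,
    "l1": lambda x: abs(x - 1.0),
}


def phi_divergence(name, q, p):
    """D_phi(Q, P) = sum_i p_i * phi(q_i / p_i) for discrete Q and P."""
    phi = PHI[name]
    return sum(pi * phi(qi / pi) for qi, pi in zip(q, p) if pi > 0)


p = [0.2, 0.3, 0.5]
q = [0.1, 0.4, 0.5]
for name in PHI:
    print(name, phi_divergence(name, q, p))
```

With the Kullback-Leibler choice of $\phi$, the sum reduces to the familiar $\sum_i q_i \ln(q_i / p_i)$ because both distributions sum to 1, and every divergence vanishes when `q == p`.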
Let us now present some well-known properties of divergences.

Property B.1. We have $D_\phi(P, Q) = 0 \Leftrightarrow P = Q$.

Property B.2. The divergence function $Q \mapsto D_\phi(Q, P)$ is

• convex,

• lower semi-continuous, for the topology that makes all the applications of the form $Q \mapsto \int f\,dQ$ continuous where $f$ is bounded and continuous, and

• lower semi-continuous for the topology of the uniform convergence.

Finally, we will also use the following property derived from the first part of corollary (1.29) page 19 of Friedrich and Igor (1987).

Property B.3.

If $T : (X, A) \to (Y, B)$ is measurable and if $D_\phi(P, Q) < \infty$, then $D_\phi(P, Q) \ge D_\phi(PT^{-1}, QT^{-1})$ with equality being reached when $T$ is surjective for $(P, Q)$.
C. Study of the sample

Let $X_1, X_2, \ldots, X_m$ be a sequence of independent random vectors with same density $f$. Let $Y_1, Y_2, \ldots, Y_m$ be a sequence of independent random vectors with same density $g$. Then, the kernel estimators $f_m$, $g_m$, $f_{a,m}$ and $g_{a,m}$ of $f$, $g$, $f_a$ and $g_a$, for all $a \in \mathbb{R}^d_*$, almost surely and uniformly converge since we assume that the bandwidth $h_m$ of these estimators meets the following conditions (see Bosq (1999)):

(Hyp): $h_m \searrow_m 0$, $m h_m \nearrow_m \infty$, $m h_m / L(h_m^{-1}) \to_m \infty$ and $L(h_m^{-1}) / LLm \to_m \infty$,
with $L(u) = \ln(u \vee e)$.
Let us consider 

« i=l' ga,H(«^i) /ii(^i) g<i,(i(C' ^\ ' / „ (=17' IT f,i(Xi " 

Our goal is to estimate the minimum of $D_\phi(g \frac{f_a}{g_a}, f)$. To do this, it is necessary for us to truncate our samples:
Let us consider now a positive sequence 0,„ such that 0,„ — > 0, ym/d^ — > 0, where y,„ is the 



almost sure convergence rate of the kernel density estimator - ym - Op(m **'<), see lemma |R3] - 
y^m IS^n ^ 0, where y^l^ is defined by 

, I (2) (2) 

for all in and all jc in R , and finally — > 0, where v,, is defined by 

I ,,gmix) fh,m(Vx) ,.gix) fbib'^x) „) 

fm(x) gb.mib X) fix) gh(b^x) 

for all b in and all x in R''. 

We will generate $f_m$, $g_m$ and $g_{b,m}$ from the starting sample and we will select the $X_i$ and $Y_i$ vectors such that $f_m(X_i) \ge \theta_m$ and $g_{b,m}(b^\top Y_i) \ge \theta_m$, for all $i$ and for all $b \in \mathbb{R}^d_*$.

The vectors meeting these conditions will be called $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$.
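
The truncation step can be sketched as follows: build a kernel density estimate from the sample, then keep only the observations whose estimated density is at least the threshold. This is a deliberately simplified, one-dimensional illustration with a Gaussian kernel; the bandwidth `h`, the threshold `theta` and the function names are assumptions, and the article's procedure additionally involves the projected estimators and all directions $b$.

```python
import math


def gaussian_kde_1d(data, h):
    """Return the Gaussian kernel density estimate built from `data`."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


def truncate(data, h, theta):
    """Keep the observations whose estimated density is at least theta."""
    f_m = gaussian_kde_1d(data, h)
    return [x for x in data if f_m(x) >= theta]


sample = [0.0, 0.1, 0.2, 0.15, 5.0]
kept = truncate(sample, h=0.3, theta=0.3)
print(kept)  # the isolated point 5.0 is discarded
```

The isolated observation is removed because the estimate there is carried by a single kernel, which is the situation the truncation is meant to guard against when ratios of estimated densities are formed.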

Consequently, the next proposition provides us with the condition required for us to derive our 

estimates: 



Proposition C.l. Using the notations introduced in mroniatowski and in section li.il it 

holds 

$\lim_{n \to \infty} \sup_{a \in \mathbb{R}^d_*} \left| (B_1(n, a) - B_2(n, a)) - D_\phi(g \frac{f_a}{g_a}, f) \right| = 0.$
"Remark C.l. With the Kullback-Leibler divergence, we can take for 0,„ the expression m^^, with 

D. Hypotheses' discussion 

D.l. Discussion of (HI). 

Let us work with the Kullback-Leibler divergence and with g and ai. 

For all b € Ri we have / ^*(^'(f^^f§^))f(x)dx = / (f||g^ - l)f(x)dx = 0, since, for any 

i> in R,, the function x i-> g(x) -^ ..^ \ is a density. The complement of * in Ri is and then 

the supremum looked for in R is -00. We can therefore conclude. It is interesting to note that we 
obtain the same verification with /, g**^"'* and a^. 
D.2. Discussion of (H3).

This hypothesis consists in the following assumptions: 

• We work with the Kullback-Leibler divergence, (0) 

• We have f{./ajx) — g{./ajx), i.e. K(gj-^,f) — Q - we could also derive the same proof with f, 
g^*"'' and ak- (1) 

Preliminary {A): Shows that A = {{c,x) e Ri\{a,}xRd; > j^^, g(x)f^^ > f(x)} = 

through a reductio ad absurdum, i.e. if we assume A ^ d). 
Thus, our hypothesis enables us to derive 

fix) = f{./ajx)fai(ajx) = g{./ajx)fa,iajx) > g{./c'^x)fc{c'^x) > f 

'^"''^ - IS implies g(./fl>)/«,(fl7x) = > g(x)f^^ = g(./c^x)fAc^x), 

i.e. f > f. We can therefore conclude. 

Preliminary (B): Shows that B = {(c,x) e Ri\{a,}xRd; < f^^, g(x)^^ < f(x)} = 

through a reductio ad absurdum, i.e. if we assume B ^ (D. 
Thus, our hypothesis enables us to derive 

fix) = f{./ajx)fa,{ajx) = g{./ajx)fa,iajx) < gi./c^x)fc{c^x) < f 
We can therefore conclude as above.
Let us now verify (H3):

WehavePM(c,fli)-PM(c,fl) = / ln( f^^ff^l ){ - )g{x)dx. Moreover, the logarithm 

In is negative on [xeWi; < 1) and is positive on [x e ; ^ 

Thus, the prehminary studies (A) and (B) show that and { ["vf) - f^^r^} always 

present a negative product. We can therefore conclude, since (c,a) i-> PM(c,a\) - PM(c,a) is 
not null for all c and for all a - with a + a\. 



E. On Huber's algorithms

In the present appendix, let us first present the projection pursuit methodologies introduced by Huber (1985). Secondly, we will show that our method encompasses Huber's.

E.1. Huber's analytic approach

Let $f$ be a density on $\mathbb{R}^d$. We define an instrumental density $g$ with the same mean and variance as $f$. Huber's methodology requires us to start with performing the $K(f, g) = 0$ test - with $K$ being the Kullback-Leibler divergence. Should this test turn out to be positive, then $f = g$ and the algorithm stops. If the test were not to be verified, the first step of Huber's algorithm would amount to defining a vector $a_1$ and a density $f^{(1)}$ by

$a_1 = \arg\inf_{a \in \mathbb{R}^d_*} K(f \frac{g_a}{f_a}, g)$ and $f^{(1)} = f \frac{g_{a_1}}{f_{a_1}}$, (E.1)

where $\mathbb{R}^d_*$ is the set of non null vectors of $\mathbb{R}^d$ and $f_a$ (resp. $g_a$) stands for the density of $a^\top X$ (resp. $a^\top Y$) when $f$ (resp. $g$) is the density of $X$ (resp. $Y$). More exactly, this results from the maximisation of $a \mapsto K(f_a, g_a)$ since $K(f, g) = K(f_a, g_a) + K(f \frac{g_a}{f_a}, g)$ and it is assumed that $K(f, g)$ is finite. In a second step, Huber replaces $f$ with $f^{(1)}$ and goes through the first step again.
By iterating this process, Huber thus obtains a sequence $(a_1, a_2, \ldots)$ of vectors of $\mathbb{R}^d_*$ and a sequence of densities $f^{(i)}$.
Remark E.1. This algorithm stops when the Kullback-Leibler divergence equals zero or when it reaches the $d$-th iteration. We then obtain an approximation of $f$ from $g$:

When there exists an integer $j$ such that $K(f^{(j)}, g) = 0$ with $j \le d$, he obtains $f^{(j)} = g$, i.e. $f = g \prod_{i=1}^{j} \frac{f^{(i-1)}_{a_i}}{g_{a_i}}$ since by induction $f^{(j)} = f \prod_{i=1}^{j} \frac{g_{a_i}}{f^{(i-1)}_{a_i}}$. Similarly, when, for all $j$, Huber gets $K(f^{(j)}, g) > 0$ with $j \le d$, he assumes $g = f^{(d)}$ in order to derive $f = g \prod_{i=1}^{d} \frac{f^{(i-1)}_{a_i}}{g_{a_i}}$.
Finally he obtains $K(f^{(0)}, g) \ge K(f^{(1)}, g) \ge \ldots \ge 0$ with $f^{(0)} = f$.

E.2. Huber's synthetic approach

Keeping the notations of the above section, we start with performing the $K(f, g) = 0$ test; should this test turn out to be positive, then $f = g$ and the algorithm stops, otherwise, the first step of his algorithm would consist in defining a vector $a_1$ and a density $g^{(1)}$ by

$a_1 = \arg\inf_{a \in \mathbb{R}^d_*} K(f, g \frac{f_a}{g_a})$ and $g^{(1)} = g \frac{f_{a_1}}{g_{a_1}}$. (E.2)

More exactly, this optimisation results from the maximisation of $a \mapsto K(f_a, g_a)$ since $K(f, g) = K(f_a, g_a) + K(f, g \frac{f_a}{g_a})$ and it is assumed that $K(f, g)$ is finite. In a second step, Huber replaces $g$ with $g^{(1)}$ and goes through the first step again. By iterating this process, Huber thus obtains a sequence $(a_1, a_2, \ldots)$ of vectors of $\mathbb{R}^d_*$ and a sequence of densities $g^{(i)}$.

Remark E.2. First, in a similar manner to the analytic approach, this methodology enables us to approximate $f$ from $g$:

To obtain an approximation of $f$, Huber either stops his algorithm when the Kullback-Leibler divergence equals zero, i.e. $K(f, g^{(j)}) = 0$ implies $g^{(j)} = f$ with $j \le d$, or when his algorithm reaches the $d$-th iteration, i.e. he approximates $f$ with $g^{(d)}$.
Second, he gets $K(f, g^{(0)}) \ge K(f, g^{(1)}) \ge \ldots \ge 0$ with $g^{(0)} = g$.

E.3. The first co-vector of f simultaneously optimizes four problems 

Let us first study Huber's analytic approach.

Let $\mathcal{R}'$ be the class of all positive functions $r$ defined on $\mathbb{R}$ and such that $f(x) r^{-1}(a^\top x)$ is a density on $\mathbb{R}^d$ for all $a$ belonging to $\mathbb{R}^d_*$. The following proposition shows that there exists a vector $a$ such that $\frac{f_a}{g_a}$ minimizes $K(f r^{-1}, g)$ in $r$.

Proposition E.1 (Analytic Approach). There exists a vector $a$ belonging to $\mathbb{R}^d_*$ such that
$\arg\min_{r \in \mathcal{R}'} K(f r^{-1}, g) = \frac{f_a}{g_a}$, and $r(a^\top x) = \frac{f_a(a^\top x)}{g_a(a^\top x)}$ and $K(f, g) = K(f_a, g_a) + K(f \frac{g_a}{f_a}, g)$.

Let us also study Huber's synthetic approach:

Let $\mathcal{R}$ be the class of all positive functions $r$ defined on $\mathbb{R}$ and such that $g(x) r(a^\top x)$ is a density on $\mathbb{R}^d$ for all $a$ belonging to $\mathbb{R}^d_*$. The following proposition shows that there exists a vector $a$ such that $\frac{f_a}{g_a}$ minimizes $K(f, g r)$ in $r$.

Proposition E.2 (Synthetic Approach). There exists a vector $a$ belonging to $\mathbb{R}^d_*$ such that
$\arg\min_{r \in \mathcal{R}} K(f, g r) = \frac{f_a}{g_a}$, and $r(a^\top x) = \frac{f_a(a^\top x)}{g_a(a^\top x)}$ and $K(f, g) = K(f_a, g_a) + K(f, g \frac{f_a}{g_a})$.

To recapitulate, the choice of $r = \frac{f_a}{g_a}$ enables us to simultaneously solve the following three optimisation problems, for $a \in \mathbb{R}^d_*$:

First, find $a$ such that $a = \arg\inf_{a \in \mathbb{R}^d_*} K(f \frac{g_a}{f_a}, g)$ - analytic approach -

Second, find $a$ such that $a = \arg\inf_{a \in \mathbb{R}^d_*} K(f, g \frac{f_a}{g_a})$ - synthetic approach -

Third, find $a$ such that $a = \arg\inf_{a \in \mathbb{R}^d_*} D_\phi(g \frac{f_a}{g_a}, f)$ - our method.
We can therefore state that the methodology we introduced in the present article encompasses Huber's.



F. Proofs 

Proof of propositions E.1 and E.2. Let us first study proposition E.2.

Without loss of generality, we will prove this proposition with $x_1$ in lieu of $a^\top X$.

Let us define $g^* = g r$. We remark that $g$ and $g^*$ present the same density conditionally to $x_1$.

Indeed,

$g^*_1(x_1) = \int g^*(x)\,dx_2 \ldots dx_d = \int r(x_1) g(x)\,dx_2 \ldots dx_d = r(x_1) \int g(x)\,dx_2 \ldots dx_d = r(x_1) g_1(x_1)$.
Thus, we can demonstrate this proposition.

We have $g(.|x_1) = \frac{g(x_1, \ldots, x_d)}{g_1(x_1)}$ and $g_1(x_1) r(x_1)$ is the marginal density of $g^*$. Hence,

$\int g^*\,dx = \int g_1(x_1) r(x_1) g(.|x_1)\,dx = \int g_1(x_1) \frac{f_1(x_1)}{g_1(x_1)} \left( \int g(.|x_1)\,dx_2 \ldots dx_d \right) dx_1 = \int f_1(x_1)\,dx_1 = 1$ and

since $g^*$ is positive, then $g^*$ is a density. Moreover,

$K(f, g^*) = \int f \{\ln(f) - \ln(g^*)\}\,dx$, (F.1)

$= \int f \{\ln(f(.|x_1)) - \ln(g^*(.|x_1)) + \ln(f_1(x_1)) - \ln(g_1(x_1) r(x_1))\}\,dx$,

$= \int f \{\ln(f(.|x_1)) - \ln(g(.|x_1)) + \ln(f_1(x_1)) - \ln(g_1(x_1) r(x_1))\}\,dx$, (F.2)

as $g^*(.|x_1) = g(.|x_1)$. Since the minimum of this last equation (F.2) is reached through the minimization of $\int f \{\ln(f_1(x_1)) - \ln(g_1(x_1) r(x_1))\}\,dx = K(f_1, g_1 r)$, then property B.1 necessarily implies that $f_1 = g_1 r$, hence $r = f_1 / g_1$.

Finally, we have $K(f, g) - K(f, g^*) = \int f \{\ln(f_1(x_1)) - \ln(g_1(x_1))\}\,dx = K(f_1, g_1)$, which completes the demonstration of proposition E.2.

Similarly, if we replace $f^* = f r^{-1}$ with $f$ and $g$ with $g^*$, we obtain the proof of proposition E.1.

□

Proof of proposition ISTTl The demonstration is also very similar to the one for proposition |Ej2l 
save for the fact we now base our reasoning at row F. 1 1 on K(g*,f) - J g*{ln(f) — ln(g*)}dx 
instead of K{f, g") = / /{/«(/) - ln(g*)}dx. □ 
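Proof of lemma F.1

Lemma F.1. We have $\Theta = \{ b \in \Theta \mid \int ( \frac{g(x) f_b(b^\top x)}{f(x) g_b(b^\top x)} - 1 ) f(x)\,dx < \infty \}$.

We get the result since $\int ( \frac{g(x) f_b(b^\top x)}{f(x) g_b(b^\top x)} - 1 ) f(x)\,dx = \int ( g(x) \frac{f_b(b^\top x)}{g_b(b^\top x)} - f(x) )\,dx = 0$. □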
Proof of proposition lF.il 

Proposition F.l. In the case where f is known and keeping the notations introduced in section 
13.71 as well as assuming (HI) to (H3) hold, then both sup^^^g l|c„(fl) - fljt|| andy„ tend to a^ a.s. 
In the same manner as in Proposition 3.4 of Broniatowski (2009), we prove this proposition through lemma F.1. □



Proof of proposition 3.3. Proposition 3.3 comes immediately from proposition C.1 and lemma F.1. □
Proof of theorem 13.11 We prove this theorem by induction. First, by the very definition of 
the kernel estimator ^Jf* = g„ converges towards g. Moreover, the continuity of a i-> /a_„ and 
a i-> ga„ and proposition l3.3l implv that ^5/' = gf^^^ converges towards g^^K Finally, since, for 

any k, gf^ - ''-^^7, we conclude similarly as for gl^\ □ 
Proof of lemma IF.2I 



lemme F.2. We have g(./ajx, a^x) — n(aJ^jX, a^x) — f(./ajx, ajjc). 

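Putting $A = (a_1, \ldots, a_d)$, let us determine $f$ in basis $A$. Let us first study the function defined by $\psi : \mathbb{R}^d \to \mathbb{R}^d$, $x \mapsto (a_1^\top x, \ldots, a_d^\top x)$. We can immediately say that $\psi$ is continuous and since $A$ is a basis, its bijectivity is obvious. Moreover, let us study its Jacobian.

By definition, it is

$J_\psi(x_1, \ldots, x_d) = \begin{vmatrix} \frac{\partial \psi_1}{\partial x_1} & \cdots & \frac{\partial \psi_1}{\partial x_d} \\ \vdots & & \vdots \\ \frac{\partial \psi_d}{\partial x_1} & \cdots & \frac{\partial \psi_d}{\partial x_d} \end{vmatrix} = \begin{vmatrix} a_{1,1} & \cdots & a_{1,d} \\ \vdots & & \vdots \\ a_{d,1} & \cdots & a_{d,d} \end{vmatrix} = |A| \neq 0$

since $A$ is a basis. We can therefore infer: $\forall x \in \mathbb{R}^d$, $\exists! y \in \mathbb{R}^d$ such that $f(x) = |A|^{-1} \Psi(y)$, i.e. $\Psi$ (resp.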
y) is the expression of / (resp of x) in basis A, namely ^{y) = nCv;+i, ■.■,yd)hiyi, ■■■,yj), with n 
and h being the expressions of n and h in basis A. Consequently, our results in the case where 
the family {fly)i<j<d is the canonical basis of R'', still hold for ^' in basis A - see section ITTI And 
then, if g is the expression of g in basis A, we have g(./yi, ...,}'/) - Hyj+u ■■■,yd) - ^(-/yu ■■■,yj), 
i.e. g(./aJx, a]x) — n(aT x, ajjc) — f(./aJx, a~^x). □ 

Proof of lemma F.3

Lemma F.3. For any continuous density $f$, we have $y_m = |f_m(x) - f(x)| = O_P(m^{-\frac{2}{4+d}})$.

Defining $b_m(x)$ as $b_m(x) = |E(f_m(x)) - f(x)|$, we have $y_m \le |f_m(x) - E(f_m(x))| + b_m(x)$. Moreover, from page 150 of Scott (1992), we derive that $b_m(x) = O_P(\sum_{j=1}^{d} h_j^2)$ where $h_j = O_P(m^{-\frac{1}{4+d}})$.

Then, we obtain $b_m(x) = O_P(m^{-\frac{2}{4+d}})$. Finally, since the central limit theorem rate is $O_P(m^{-\frac{1}{2}})$, we infer that $y_m \le O_P(m^{-\frac{1}{2}}) + O_P(m^{-\frac{2}{4+d}}) = O_P(m^{-\frac{2}{4+d}})$. □
Proof of lemma F.4

Lemma F.4. Let $f$ be an absolutely continuous density, then, for all sequences $(a_n)$ tending to $a$ in $\mathbb{R}^d_*$, sequence $f_{a_n}$ uniformly converges towards $f_a$.

Proof. For all $a$ in $\mathbb{R}^d_*$, let $F_a$ be the cumulative distribution function of $a^\top X$ and $\psi_a$ be a complex function defined by $\psi_a(u, v) = F_a(Re(u + iv)) + iF_a(Re(v + iu))$, for all $u$ and $v$ in $\mathbb{R}$.
First, the function $\psi_a(u, v)$ is an analytic function, because $x \mapsto f_a(a^\top x)$ is continuous and as a result of the corollary of Dini's second theorem - according to which "A sequence of cumulative distribution functions, which pointwise converges on $\mathbb{R}$ towards a continuous cumulative distribution function $F$ on $\mathbb{R}$, uniformly converges towards $F$ on $\mathbb{R}$" - we deduce that, for all sequences $(a_n)$ converging towards $a$, $\psi_{a_n}$ uniformly converges towards $\psi_a$. Finally, the Weierstrass theorem (see proposal (10.1) page 220 of the "Calcul infinitésimal" book of Jean Dieudonné) implies that all sequences $\psi'_{a_n}$ uniformly converge towards $\psi'_a$, for all $a_n$ tending to $a$. We can therefore conclude. □

Proof of lemma F.5. By definition of the closure of a set, we have

Lemma F.5. The set $\Gamma_c$ is closed in $L^1$ for the topology of the uniform convergence.

Proof of lemma F.6. Since $K$ is greater than the $L^1$ distance, we have

Lemma F.6. For all $c > 0$, we have $\Gamma_c \subset B_{L^1}(f, c)$, where $B_{L^1}(f, c) = \{ p \in L^1;\ \|f - p\|_1 \le c \}$.

Proof of lemma F.7. The definition of the closure of a set and lemma F.4 imply:

Lemma F.7. $G$ is closed in $L^1$ for the topology of the uniform convergence.

Proof of lemma F.8

Lemma F.8. $\inf_{a \in \mathbb{R}^d_*} D_\phi(g^*, f)$ is reached when the $\phi$-divergence is greater than the $L^1$ distance as well as the $L^2$ distance.

Proof. Indeed, let $G$ be $\{ g \frac{f_a}{g_a};\ a \in \mathbb{R}^d_* \}$ and $\Gamma_c$ be $\Gamma_c = \{ p;\ K(p, f) \le c \}$ for all $c > 0$. From lemmas F.5, F.6 and F.7, we get that $\Gamma_c \cap G$ is a compact for the topology of the uniform convergence, if $\Gamma_c \cap G$ is not empty. Hence, and since property B.2 implies that $Q \mapsto D_\phi(Q, P)$ is lower semi-continuous in $L^1$ for the topology of the uniform convergence, then the infimum is reached in $L^1$. (Taking for example $c = D_\phi(g, f)$, $\Gamma_c \cap G$ is necessarily not empty because we always have $D_\phi(g \frac{f_a}{g_a}, f) \le D_\phi(g, f)$.) Moreover, when the $\phi$-divergence is greater than the $L^2$ distance, the very definition of the $L^2$ space enables us to provide the same proof as for the $L^1$ distance. □

Proof of lemma IF.9i 

lemme F.9. For any p < d, we have g^j'^ = ga^,. 

Assuming, without any loss of generality, that the a,, i - 1, .., p, are the vectors of the canon- 
ical basis, since g^''-^\x) = gWf^l^^- f'l';^'"') we derive immediately that g*r" = gp- We 
note that it is sufficient to operate a change in basis on the a,- to obtain the general case since 
A - (fl,) is a basis - see lemma IfToI □ 
Proof of lemma F.10.

Lemma F.10. If there exists $p$, $p \le d$, such that $D_\phi(g^{(p)}, f) = 0$, then the family $(a_i)_{i=1,..,p}$ is
free and is orthogonal.

Without any loss of generality, let us assume that $p = 2$ and that the $a_i$ are the vectors of
the canonical basis. Using a reductio ad absurdum based on the hypotheses $a_1 = (1, 0, ..., 0)$
and $a_2 = (\alpha, 0, ..., 0)$, where $\alpha \in \mathbb{R}$, we get $g^{(1)}(x) = g(x_2, .., x_d / x_1) f_1(x_1)$ and
$$f = g^{(2)}(x) = g(x_2, .., x_d / x_1) f_1(x_1) \frac{f_{\alpha a_1}(\alpha x_1)}{[g^{(1)}]_{\alpha a_1}(\alpha x_1)}.$$
Hence $f(x_2, .., x_d / x_1) = g(x_2, .., x_d / x_1) \frac{f_{\alpha a_1}(\alpha x_1)}{[g^{(1)}]_{\alpha a_1}(\alpha x_1)}$. It consequently
implies that $f_{\alpha a_1}(\alpha x_1) = [g^{(1)}]_{\alpha a_1}(\alpha x_1)$, since
$$1 = \int f(x_2, .., x_d / x_1)\, dx_2 ... dx_d = \int g(x_2, .., x_d / x_1)\, dx_2 ... dx_d\ \frac{f_{\alpha a_1}(\alpha x_1)}{[g^{(1)}]_{\alpha a_1}(\alpha x_1)} = \frac{f_{\alpha a_1}(\alpha x_1)}{[g^{(1)}]_{\alpha a_1}(\alpha x_1)}.$$
Therefore, $g^{(2)} = g^{(1)}$, i.e. $p = 1$, which leads to a contradiction. Hence, the family is free.

Moreover, using a reductio ad absurdum, we get the orthogonality. Indeed, we have
$\int f(x)\, dx = 1 \ne +\infty = \int n(a_{j+1}^\top x, ..., a_d^\top x)\, h(a_1^\top x, ..., a_j^\top x)\, dx$. The use of the same argument as in
the proof of lemma F.11 enables us to infer the orthogonality of $(a_i)_{i=1,..,p}$. □
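
The first projection step used in the proof above replaces the marginal of g along the first axis by that of f while keeping the conditional density of g. The following is a minimal numerical sketch of that step, under assumptions that are not in the text: dimension 2, g a bivariate normal density, and a hypothetical target marginal f_1. All function names are illustrative.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def g(x1, x2, rho=0.5):
    """Illustrative g: bivariate normal, unit variances, correlation rho."""
    det = 1.0 - rho * rho
    q = (x1 * x1 - 2.0 * rho * x1 * x2 + x2 * x2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def g_1(x1):
    """Marginal of g along the first canonical axis (standard normal)."""
    return normal_pdf(x1)


def f_1(x1):
    """Hypothetical target marginal f_1: normal with mean 1 and sd 0.7."""
    return normal_pdf(x1, mean=1.0, sd=0.7)


def g_step1(x1, x2):
    """First step: g^(1)(x) = g(x) f_1(x_1) / g_1(x_1)."""
    return g(x1, x2) * f_1(x1) / g_1(x1)


def midpoint_integral(func, lo, hi, n):
    """Midpoint Riemann sum of func over [lo, hi] with n cells."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def marginal_of_g_step1(x1, lo=-10.0, hi=10.0, n=400):
    """Integrate g^(1) over x_2 to obtain its marginal at x_1."""
    return midpoint_integral(lambda x2: g_step1(x1, x2), lo, hi, n)


# g^(1) should still integrate to 1 and have f_1 as its first marginal.
total_mass = midpoint_integral(marginal_of_g_step1, -10.0, 10.0, 400)
```

Under these assumptions, `marginal_of_g_step1(x1)` agrees with `f_1(x1)` up to quadrature error, and `total_mass` is close to 1.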
Proof of lemma F.11.

Lemma F.11. Should there exist a family $(a_i)_{i=1,..,d}$ such that $f(x) = n(a_{j+1}^\top x, ..., a_d^\top x)\, h(a_1^\top x, ..., a_j^\top x)$,
with $j < d$, with $f$, $n$ and $h$ being densities, then this family is an orthogonal basis of $\mathbb{R}^d$.

Using a reductio ad absurdum, we have $\int f(x)\, dx = 1 \ne +\infty = \int n(a_{j+1}^\top x, ..., a_d^\top x)\, h(a_1^\top x, ..., a_j^\top x)\, dx$.
We can therefore conclude. □
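Proof of proposition 4.1.

Through lemma F.10, we can consequently infer that $(a_1, ..., a_d)$ is a basis of $\mathbb{R}^d$. Let us now write
$f$ in the $A$ system. Let us first study the function defined by $\psi : \mathbb{R}^d \to \mathbb{R}^d$, $x \mapsto (a_1^\top x, .., a_d^\top x)$.
We can say $\psi$ is continuous and, since $A$ is a basis, its bijectivity is obvious. Let us also study its
Jacobian. By definition, it is
$$J_\psi(x_1, ..., x_d) = \begin{vmatrix} \dfrac{\partial \psi_1}{\partial x_1} & \cdots & \dfrac{\partial \psi_1}{\partial x_d} \\ \vdots & & \vdots \\ \dfrac{\partial \psi_d}{\partial x_1} & \cdots & \dfrac{\partial \psi_d}{\partial x_d} \end{vmatrix} = |A| \ne 0$$
since $A$ is a basis. Thus, we can infer that, in basis $A$, the writing of $f$ (resp. $x$) exists and is
unique. Defining $\Psi$ (resp. $y$) as this new form of $f$ (resp. $x$), we have $f(x) = |A|^{-1} \Psi(y)$ (resp.
$(a_1^\top x, .., a_d^\top x) = (y_1, ..., y_d)$). Similarly, let us define $\tilde\Psi$ (resp. $\tilde\Psi^{(i)}$) as being the form of $g$ (resp.
$g^{(i)}$) in basis $A$; we also have $\tilde\Psi(y) = |A|\, g(x)$
(resp. $\tilde\Psi^{(i)}(y) = |A|\, g^{(i)}(x)$).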
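
As an illustration of the Jacobian argument above, the sketch below checks numerically that the linear map sending x to its coordinates along the rows of A has a constant Jacobian matrix equal to A, with nonzero determinant when the rows form a basis. The matrix A and the evaluation point are hypothetical values chosen for the example, and the dimension is fixed at 3 for simplicity.

```python
def psi(A, x):
    """Coordinates of x along the rows a_1, ..., a_d of A: (a_1.x, ..., a_d.x)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def numerical_jacobian(A, x, eps=1e-6):
    """Forward finite-difference Jacobian matrix of psi at x."""
    base = psi(A, x)
    columns = []
    for j in range(len(x)):
        shifted = list(x)
        shifted[j] += eps
        moved = psi(A, shifted)
        columns.append([(m - b) / eps for m, b in zip(moved, base)])
    # columns[j][i] = d psi_i / d x_j, so transpose to get rows indexed by i.
    return [list(row) for row in zip(*columns)]


# Hypothetical orthogonal (not orthonormal) basis of R^3, for illustration only.
A = [[1.0, 1.0, 0.0],
     [1.0, -1.0, 0.0],
     [0.0, 0.0, 2.0]]

J = numerical_jacobian(A, [0.3, -1.2, 2.5])
jacobian_determinant = det3(J)
```

Because the map is linear, the finite-difference Jacobian does not depend on the evaluation point, which is why the change of variables only introduces the constant factor |A|.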

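Now, through a finite induction in $i$, $1 \le i \le d$, let us demonstrate the following property:

$P(i)$ = "$\tilde\Psi^{(i)}(y) = \tilde\Psi(y_{i+1}, ..., y_d / y_1, ..., y_i)\, \Psi_1(y_1) \Psi_2(y_2) ... \Psi_i(y_i)$".

Initialisation:

For $i = 1$. The above notations lead us to $\tilde\Psi^{(1)}(y) = \tilde\Psi(y) \frac{\Psi_1(y_1)}{\tilde\Psi_1(y_1)}$, since $y_1 = a_1^\top x$ through the
change in variables, i.e. $\tilde\Psi^{(1)}(y) = \tilde\Psi(y_2, ..., y_d / y_1)\, \Psi_1(y_1)$, by the very definition of conditional
density. Hence, $P(1)$ holds true.

For $i = 2$. Since $P(1)$ is true, we can write
$$\begin{aligned}
\tilde\Psi^{(2)}(y) &= \tilde\Psi^{(1)}(y) \frac{\Psi_2(y_2)}{\tilde\Psi^{(1)}_2(y_2)}, \text{ by definition of } \tilde\Psi^{(2)}(y), \\
&= \tilde\Psi(y_2, ..., y_d / y_1)\, \Psi_1(y_1) \frac{\Psi_2(y_2)}{\tilde\Psi^{(1)}_2(y_2)}, \text{ since } P(1) \text{ is true}, \\
&= \tilde\Psi(y_2, ..., y_d / y_1)\, \Psi_1(y_1) \frac{\Psi_2(y_2)}{\tilde\Psi_2(y_2)}, \text{ since } \tilde\Psi^{(1)}_2(y_2) = \tilde\Psi_2(y_2), \\
&= \tilde\Psi(y_3, ..., y_d / y_1, y_2)\, \Psi_1(y_1) \Psi_2(y_2),
\end{aligned}$$
by the very definition of conditional density. Thus, $P(2)$ holds true.

Going from $i - 1$ to $i$ ($i \le p$):

Let us assume $P(i - 1)$ is true; we can then show $P(i)$.
$$\begin{aligned}
\tilde\Psi^{(i)}(y) &= \tilde\Psi^{(i-1)}(y) \frac{\Psi_i(y_i)}{\tilde\Psi^{(i-1)}_i(y_i)}, \text{ by definition of } \tilde\Psi^{(i)}(y), \\
&= \tilde\Psi(y_i, ..., y_d / y_1, .., y_{i-1})\, \Psi_1(y_1) ... \Psi_{i-1}(y_{i-1}) \frac{\Psi_i(y_i)}{\tilde\Psi^{(i-1)}_i(y_i)}, \text{ since } P(i - 1) \text{ is true}, \\
&= \tilde\Psi(y_i, ..., y_d / y_1, ..., y_{i-1})\, \Psi_1(y_1) ... \Psi_{i-1}(y_{i-1}) \frac{\Psi_i(y_i)}{\tilde\Psi_i(y_i)}, \text{ since } \tilde\Psi^{(i-1)}_i(y_i) = \tilde\Psi_i(y_i), \\
&= \tilde\Psi(y_{i+1}, ..., y_d / y_1, ..., y_i)\, \Psi_1(y_1) \Psi_2(y_2) ... \Psi_i(y_i), \text{ by the very definition of conditional density.}
\end{aligned}$$
Thus, $P(i)$ is true.

Conclusion:

The induction principle enables us to infer that $P(i)$ holds true for $1 \le i \le d$.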

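At present, since $D_\phi(\tilde\Psi^{(d)}, \Psi) = 0$, the above entails that $\tilde\Psi^{(d)}(y) = \Psi(y)$, i.e.
$$\frac{\Psi(y)}{\Psi_1(y_1) \Psi_2(y_2) ... \Psi_d(y_d)} = \frac{\tilde\Psi(y)}{\tilde\Psi_1(y_1) \tilde\Psi_2(y_2) ... \tilde\Psi_d(y_d)}.$$
We finally obtain $\frac{\partial^d}{\partial y_1 ... \partial y_d} \tilde C = \frac{\partial^d}{\partial y_1 ... \partial y_d} C$, where $C$ and $\tilde C$ are the respective copulas of $\Psi$ and $\tilde\Psi$.
Let us remark that, if the $(a_i)$ are the canonical basis of $\mathbb{R}^d$, we have
$$\frac{\partial^d}{\partial x_1 ... \partial x_d} C_f = \frac{\partial^d}{\partial x_1 ... \partial x_d} C_g,$$
where $C_f$ and $C_g$ are the respective copulas of $f$ and $g$. □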
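
The identity above says that a density divided by the product of its marginal densities is the copula density evaluated at the marginal cdf values, so two densities with equal ratios share the same copula density. The sketch below checks this in one simple case that is an assumption of the example, not of the text: a centred bivariate normal density, whose copula is the Gaussian copula with a known closed-form density. The function names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for quantiles


def bivariate_normal_pdf(x1, x2, sd1, sd2, rho):
    """Centred bivariate normal density with standard deviations sd1, sd2."""
    z1, z2 = x1 / sd1, x2 / sd2
    det = 1.0 - rho * rho
    q = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * sd1 * sd2 * math.sqrt(det))


def density_over_marginals(x1, x2, sd1, sd2, rho):
    """Ratio density / (product of the marginal densities)."""
    m1, m2 = NormalDist(0.0, sd1), NormalDist(0.0, sd2)
    joint = bivariate_normal_pdf(x1, x2, sd1, sd2, rho)
    return joint / (m1.pdf(x1) * m2.pdf(x2))


def gaussian_copula_density(u1, u2, rho):
    """Closed-form density of the bivariate Gaussian copula."""
    s, t = STD.inv_cdf(u1), STD.inv_cdf(u2)
    det = 1.0 - rho * rho
    expo = -(rho * rho * (s * s + t * t) - 2.0 * rho * s * t) / (2.0 * det)
    return math.exp(expo) / math.sqrt(det)


rho = 0.6
x1, x2 = 0.8, -1.1
ratio = density_over_marginals(x1, x2, 2.0, 0.5, rho)
u1 = NormalDist(0.0, 2.0).cdf(x1)
u2 = NormalDist(0.0, 0.5).cdf(x2)
copula_value = gaussian_copula_density(u1, u2, rho)

# Same correlation, different marginals, same quantile levels: same ratio.
ratio_unit_marginals = density_over_marginals(0.4, -2.2, 1.0, 1.0, rho)
```

In this example `ratio`, `copula_value` and `ratio_unit_marginals` coincide up to floating-point error, although the two densities have different marginals.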
Proof of proposition C.1.

Let us first note that we will prove this proposition for $k \ge 2$, i.e. in the case where $g^{(k-1)}$ is not
known. The initial case, using the known density $g^{(0)} = g$, will be an immediate consequence of
the above.

Moreover, going forward, to be more legible, we will use $g$ (resp. $g_n$) in lieu of $g^{(k-1)}$ (resp.
$g_n^{(k-1)}$).

We can therefore remark that we have $f(X_i) \ge \theta_n - y_n$, $g(Y_i) \ge \theta_n - y_n$ and $g_b(b^\top Y_i) \ge \theta_n - y_n$, for
all $i$ and for all $b \in \mathbb{R}^d_*$, thanks to the uniform convergence of the kernel estimators. Indeed, we
have $f(X_i) = f(X_i) - f_n(X_i) + f_n(X_i) \ge -y_n + f_n(X_i)$, by definition of $y_n$, and then $f(X_i) \ge -y_n + \theta_n$,
by hypothesis on $f_n(X_i)$. This is also true for $g_n$ and $g_{b,n}$.

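This entails
$$\sup_{b} \left| \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_{b,n}(b^\top Y_i)}{g_{b,n}(b^\top Y_i)} \frac{g_n(Y_i)}{f_n(Y_i)} \right\} \frac{f_{a,n}(a^\top Y_i)}{g_{a,n}(a^\top Y_i)} - \int \phi'\left\{ \frac{g(x)}{f(x)} \frac{f_b(b^\top x)}{g_b(b^\top x)} \right\} g(x) \frac{f_a(a^\top x)}{g_a(a^\top x)}\, dx \right| \to 0 \text{ a.s.}$$

Indeed, let us remark that
$$\begin{aligned}
&\left| \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_{b,n}(b^\top Y_i)}{g_{b,n}(b^\top Y_i)} \frac{g_n(Y_i)}{f_n(Y_i)} \right\} \frac{f_{a,n}(a^\top Y_i)}{g_{a,n}(a^\top Y_i)} - \int \phi'\left\{ \frac{g(x)}{f(x)} \frac{f_b(b^\top x)}{g_b(b^\top x)} \right\} g(x) \frac{f_a(a^\top x)}{g_a(a^\top x)}\, dx \right| \\
&\quad \le \left| \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_{b,n}(b^\top Y_i)}{g_{b,n}(b^\top Y_i)} \frac{g_n(Y_i)}{f_n(Y_i)} \right\} \frac{f_{a,n}(a^\top Y_i)}{g_{a,n}(a^\top Y_i)} - \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_b(b^\top Y_i)}{g_b(b^\top Y_i)} \frac{g(Y_i)}{f(Y_i)} \right\} \frac{f_a(a^\top Y_i)}{g_a(a^\top Y_i)} \right| \\
&\qquad + \left| \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_b(b^\top Y_i)}{g_b(b^\top Y_i)} \frac{g(Y_i)}{f(Y_i)} \right\} \frac{f_a(a^\top Y_i)}{g_a(a^\top Y_i)} - \int \phi'\left\{ \frac{g(x)}{f(x)} \frac{f_b(b^\top x)}{g_b(b^\top x)} \right\} g(x) \frac{f_a(a^\top x)}{g_a(a^\top x)}\, dx \right|.
\end{aligned}$$

Moreover, since $\int \left| \phi'\left\{ \frac{g(x)}{f(x)} \frac{f_b(b^\top x)}{g_b(b^\top x)} \right\} g(x) \frac{f_a(a^\top x)}{g_a(a^\top x)} \right| dx < \infty$, as implied by lemma 1531, and since we
assumed $g$ such that $D_\phi(g, f) < \infty$ and $D_\phi(f, g) < \infty$, and since $b \in \Theta^{D_\phi}$, the law of large numbers
enables us to state that
$$\left| \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_b(b^\top Y_i)}{g_b(b^\top Y_i)} \frac{g(Y_i)}{f(Y_i)} \right\} \frac{f_a(a^\top Y_i)}{g_a(a^\top Y_i)} - \int \phi'\left\{ \frac{g(x)}{f(x)} \frac{f_b(b^\top x)}{g_b(b^\top x)} \right\} g(x) \frac{f_a(a^\top x)}{g_a(a^\top x)}\, dx \right| \to 0 \text{ a.s.}$$
Furthermore,
$$\begin{aligned}
&\left| \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_{b,n}(b^\top Y_i)}{g_{b,n}(b^\top Y_i)} \frac{g_n(Y_i)}{f_n(Y_i)} \right\} \frac{f_{a,n}(a^\top Y_i)}{g_{a,n}(a^\top Y_i)} - \frac{1}{n} \sum_{i=1}^{n} \phi'\left\{ \frac{f_b(b^\top Y_i)}{g_b(b^\top Y_i)} \frac{g(Y_i)}{f(Y_i)} \right\} \frac{f_a(a^\top Y_i)}{g_a(a^\top Y_i)} \right| \\
&\quad \le \frac{1}{n} \sum_{i=1}^{n} \left| \phi'\left\{ \frac{f_{b,n}(b^\top Y_i)}{g_{b,n}(b^\top Y_i)} \frac{g_n(Y_i)}{f_n(Y_i)} \right\} \frac{f_{a,n}(a^\top Y_i)}{g_{a,n}(a^\top Y_i)} - \phi'\left\{ \frac{f_b(b^\top Y_i)}{g_b(b^\top Y_i)} \frac{g(Y_i)}{f(Y_i)} \right\} \frac{f_a(a^\top Y_i)}{g_a(a^\top Y_i)} \right|
\end{aligned}$$
and each term of the latter sum tends to 0 as a result of the hypotheses initially
introduced on $\theta_n$. Consequently, the latter sum tends to 0, as
it is a Cesàro mean. This enables us to conclude. Similarly, we obtain
$$\sup_{b} \left| \frac{1}{n} \sum_{i=1}^{n} \phi^*\left( \phi'\left\{ \frac{f_{b,n}(b^\top X_i)}{g_{b,n}(b^\top X_i)} \frac{g_n(X_i)}{f_n(X_i)} \right\} \right) - \int \phi^*\left( \phi'\left\{ \frac{g(x)}{f(x)} \frac{f_b(b^\top x)}{g_b(b^\top x)} \right\} \right) f(x)\, dx \right| \to 0 \text{ a.s.}$$ □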

Proof of theorem F.1.

Theorem F.1. Assume that (H1) to (H3), (H6) and (H8) hold. Then,

where $k$ represents the $k$-th step of the algorithm and with $I$ being the identity matrix in $\mathbb{R}^d$.

Note that $k$ is fixed in theorem F.1 since $\gamma_n = \arg\inf_{a \in \Theta} \sup_{c \in \Theta} \mathbb{P}_n M(c, a)$, where $M$ is a known
function of $k$, $f$ and $g^{(k-1)}$.
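Proof. Through a Taylor development of $\mathbb{P}_n M(c_n(a_k), \gamma_n)$ of rank 2, we get at point $(a_k, a_k)$:
$$\begin{aligned}
\mathbb{P}_n M(c_n(a_k), \gamma_n) = {} & \mathbb{P}_n M(a_k, a_k) + \mathbb{P}_n \frac{\partial}{\partial a} M(a_k, a_k) (\gamma_n - a_k)^\top + \mathbb{P}_n \frac{\partial}{\partial b} M(a_k, a_k) (c_n(a_k) - a_k)^\top \\
& + \frac{1}{2} \Big\{ (\gamma_n - a_k)^\top \mathbb{P}_n \frac{\partial^2}{\partial a \partial a} M(a_k, a_k) (\gamma_n - a_k) + (c_n(a_k) - a_k)^\top \mathbb{P}_n \frac{\partial^2}{\partial b \partial a} M(a_k, a_k) (\gamma_n - a_k) \\
& \quad + (\gamma_n - a_k)^\top \mathbb{P}_n \frac{\partial^2}{\partial a \partial b} M(a_k, a_k) (c_n(a_k) - a_k) + (c_n(a_k) - a_k)^\top \mathbb{P}_n \frac{\partial^2}{\partial b \partial b} M(a_k, a_k) (c_n(a_k) - a_k) \Big\}.
\end{aligned}$$
The lemma below enables us to conclude.

Lemma F.12. Let $H$ be an integrable function and let $C = \int H\, d\mathbb{P}$ and $C_n = \int H\, d\mathbb{P}_n$,
then $C_n - C = O_{\mathbb{P}}(1/\sqrt{n})$.

Thus we get $\mathbb{P}_n M(c_n(a_k), \gamma_n) = \mathbb{P}_n M(a_k, a_k) + O_{\mathbb{P}}(1/n)$,
i.e. $\sqrt{n}(\mathbb{P}_n M(c_n(a_k), \gamma_n) - \mathbb{P} M(a_k, a_k)) = \sqrt{n}(\mathbb{P}_n M(a_k, a_k) - \mathbb{P} M(a_k, a_k)) + o_{\mathbb{P}}(1)$.
Hence $\sqrt{n}(\mathbb{P}_n M(c_n(a_k), \gamma_n) - \mathbb{P} M(a_k, a_k))$ abides by the same limit distribution as
$\sqrt{n}(\mathbb{P}_n M(a_k, a_k) - \mathbb{P} M(a_k, a_k))$, which is $\mathcal{N}(0, \mathrm{Var}_{\mathbb{P}}(M(a_k, a_k)))$. □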
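
The rate stated in lemma F.12 can be illustrated by simulation. The sketch below is only an illustration under assumptions that are not in the text: H(x) = x^2, observations drawn from a standard normal distribution (so that C = 1 and the variance of H is 2), and hypothetical sample sizes and seeds. If C_n - C is of order 1/sqrt(n), then the variance of sqrt(n)(C_n - C) should stay near 2 whatever n is.

```python
import math
import random


def scaled_errors(n, replications, seed):
    """Draw sqrt(n) * (C_n - C) repeatedly for H(x) = x^2 and X ~ N(0, 1)."""
    rng = random.Random(seed)
    c_true = 1.0  # C = E[X^2] for a standard normal
    out = []
    for _ in range(replications):
        c_n = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n
        out.append(math.sqrt(n) * (c_n - c_true))
    return out


def variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# The two variances should both be close to Var(X^2) = 2, despite n differing
# by a factor of 16, which is what the 1/sqrt(n) rate predicts.
var_small = variance(scaled_errors(n=100, replications=300, seed=1))
var_large = variance(scaled_errors(n=1600, replications=300, seed=2))
```

The same scaling is what allows the proof to replace the empirical quantity by its limit at the cost of a remainder that vanishes after multiplication by sqrt(n).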

Proof of theorem 3.2. Through proposition C.1 and theorem F.1, we derive theorem 3.2. □